16 items tagged "data management"

  • 4 Tips to help maximize the value of your data

    Summer’s lease hath all too short a date.

    It always seems to pass by in the blink of an eye, and this year was no exception. Though I am excited for cooler temperatures and the prismatic colors of New England in the fall, I am sorry to see summer come to an end. The end of summer also means that kids are back in school, reunited with their friends and equipped with a bevy of new supplies for the new year. Our kids have the tools and supplies they need for success; why shouldn’t your business?

    This month’s Insights Beat focuses on additions to our team’s ever-growing body of research on new and emerging data and analytics technologies that help companies maximize the value of their data.

    Get real about real time

    Noel Yuhanna and Mike Gualtieri published a Now Tech article on translytical data platforms. Since we first introduced the term a few years ago, translytical data platforms have been a scorching hot topic in database technology. Enabling real-time insights is imperative in the age of the customer, and there are a number of vendors who can help you streamline your data management. Check out their new report for an overview of 18 key firms operating in this space, and look for a soon-to-be-published Forrester Wave™ evaluation in this space, as well.

    Don’t turn a blind eye to computer vision

    Interested in uncovering data insights from visual assets? Look no further than computer vision. While this technology has existed in one form or another for many years, development in convolutional neural networks reinvigorated computer vision R&D (and indeed established computer vision as the pseudo-progenitor of many exciting new AI technologies). Don’t turn a blind eye to computer vision just because you think it doesn’t apply to your business. Computer vision already has a proven track record for a wide variety of use cases. Kjell Carlsson published a New Tech report to help companies parse a diverse landscape of vendors and realize their (computer) vision.

    Humanize B2B with AI

    AI now touches on virtually all aspects of business. As techniques grow more and more sophisticated, so too do its use cases. Allison Snow explains how B2B insights pros can leverage emerging AI technologies to drive empathy, engagement, and emotion. Check out the full trilogy of reports and overview now. 

    Drive data literacy with data leadership

    Of course, disruptive changes to data strategy can be a hard sell, especially when your organization lacks the structural forces to advocate for new ideas. Jennifer Belissent, in a recent blog, makes the case for why data leadership is crucial to driving better data literacy. Stay tuned for her full report on data literacy coming soon.

    More than just leadership, data and analytics initiatives require investment, commitment, and an acceptance of disruption. No initiative will be perfect from the get-go, and it’s important to remember that analytics initiatives don’t usually come with a magician’s reveal.

    Author: Srividya Sridharan

    Source: Forrester

  • Data management: compliance, protection, and the role of IT

    The business benefit of data and data-driven decisions cannot be overstated, a view that is widely shared across today’s business landscape. At the same time, there are sensitivities around where that data comes from and how it’s being accessed or used. For this reason, data protection and privacy are defining topics of our time and, for enterprise companies, essential to remaining a going concern.

    To ensure regulatory compliance and generate business value, any data coming into an organization needs to be confidentially handled, trusted and protected. Modern businesses also want their products to be cloud deployable, but many businesses have security concerns that come with sharing information in the cloud. It’s crucial that when you use data, you also protect it, preserve the integrity of original personal ownership, and maintain the privacy of the person to whom it belongs at all costs.

    The first level of data protection is to not collect personal data if there is no legitimate purpose in doing so. If personal data was collected and a legitimate purpose no longer exists, it must be deleted.

    The second level of data protection can be realized through a framework of technology measures: Identity and access management, patch management, separation of business purpose (disaggregation of legal entities), and encryption.

    IT teams often provide data in an encrypted format as a means to get people the information they need without compromising sensitive information. People receiving the data don’t usually need to know every bit of data; they just want an aggregate view of what the data looks like. And IT teams want to ensure that when they transfer important data assets, the information is secure.

    Additionally, when it comes to being data compliant, there are rules and regulations that businesses must follow, such as the General Data Protection Regulation (GDPR) and data protection and privacy agreements.

    GDPR harmonizes data protection regulation throughout the European Union and gives individuals more control over their data. It imposes expansive rules about processing data backed by powerful enforcement, so IT teams must ensure they are compliant. This creates an extra, guaranteed level of security for corporate and personal data, though it’s not without its complications for enterprises.

    Concretely, this means that companies have to technically ensure that only necessary sets move through ‘boundaryless’ end-to-end business scenarios. Here, we consider efficient data control in and through the context of comprehensive business processing for a declared purpose that is legally secured, including by consent of the individual that the data is related to.

    The business context and its technical rendering through customizing and configuration is central to the business capability of efficiently controlling data for purposes of data protection and privacy. Integrated services provide business context by showing information contained in any one data set that is linked to ordered business objects and business object types related to the data subject.

    Here, we have offered an embedded view of the data subject, which can be uniformly changed and managed in the context of a logical sequence of business events.

    Data management capabilities

    To further protect data and stay compliant, many IT teams have started with the approach of applying data management capabilities to encrypt and anonymize data without actually changing the data set. IT simply changes the way data is presented to ensure data is safe.

    One recent example is the adoption of the GDPR rules to be compliant with the legal regulations. In this case the data management capabilities must ensure that only the allowed data is shown and that protected personal data is hidden or deleted (information lifecycle management) without destroying required information and connections.
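
    The idea of changing how data is presented, rather than the stored data itself, can be sketched as follows. This is a minimal illustration, not a description of any particular product: the record layout, the `PERSONAL_FIELDS` set, and the `masked_view` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass, asdict

# Hypothetical classification of personal fields; a real system would
# take this from its own policy metadata.
PERSONAL_FIELDS = {"name", "email"}


@dataclass(frozen=True)
class CustomerRecord:
    customer_id: int
    name: str
    email: str
    country: str
    order_total: float


def masked_view(record: CustomerRecord, may_see_personal_data: bool) -> dict:
    """Return a presentation copy of the record; the stored record is untouched."""
    view = asdict(record)
    if not may_see_personal_data:
        for field_name in PERSONAL_FIELDS:
            view[field_name] = "***"
    return view


record = CustomerRecord(1, "Ada Example", "ada@example.com", "DE", 120.0)
analyst_view = masked_view(record, may_see_personal_data=False)
```

    The underlying record keeps its values and its key (`customer_id`), so connections between data sets survive while the personal fields stay hidden from the analyst.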

    By transitioning to what we call an 'intelligent organization', businesses can feed applications and processes with the data essential for the digital economy and intelligently connect people, data and processes safely and securely.

    Solutions offer customers comprehensive in-depth information about the places where their master data exists, which parts reside in which services, applications or systems, and how the data can be accessed, or they can even get direct access. Moreover, a clear picture of the complete master data set and all individual owners can be obtained, including rules for creating data consistency. This provides overall consistency, and the robustness that is required in a service-driven enterprise environment.

    Tiered levels of access

    Another tactic for keeping data secure is for IT to work closely with each line of business to set tiered levels of access, creating a workflow scenario that grants individual persons first-level, second-level, third-level (and so on) access to data within a specific line of business.

    In contrast to the more traditional model outlined above, IT teams can offer a tiered approach to authorization. Users have limited access based on transaction codes, organizational levels, etc., by assigning authorization roles through different lines of business.
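
    A tiered authorization check of this kind might look like the sketch below. The tier names, the sample users, and the rule that a higher tier includes the lower ones are assumptions made for illustration; the article does not specify them.

```python
from enum import IntEnum


class AccessTier(IntEnum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


# Hypothetical: minimum tier required per (line of business, data set).
REQUIRED_TIER = {
    ("finance", "invoices"): AccessTier.FIRST,
    ("finance", "payroll"): AccessTier.THIRD,
}

# Hypothetical role assignments: user -> {line of business: granted tier}.
USER_ROLES = {
    "alice": {"finance": AccessTier.THIRD},
    "bob": {"finance": AccessTier.FIRST},
}


def can_access(user: str, line_of_business: str, data_set: str) -> bool:
    """Allow access only if the user's tier meets the data set's required tier."""
    required = REQUIRED_TIER.get((line_of_business, data_set))
    granted = USER_ROLES.get(user, {}).get(line_of_business)
    if required is None or granted is None:
        return False  # deny by default when nothing is configured
    return granted >= required
```

    Denying by default means an unknown user or an unclassified data set never becomes readable by accident.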

    Best practices for data compliance and protection

    Both approaches outlined above allow businesses to review their data to determine the real value of it without compromising the security of the data.

    Overall, it’s important that data compliance is not only a tech topic, but a topic that should be discussed, rolled out, and followed company-wide. As 2019 comes to a close, companies must have a data compliance program in place, a data protection culture within their organizations and the ability for employees to understand the importance of change processes and tools to adhere to the new regulations.

    Including such aspects from the beginning can be a competitive advantage for companies and should be considered at an early stage. Not adhering to data protection and privacy rules and regulations can cause tremendous damage to a company’s image and reputation and can have a heavy financial impact.

    Author: Katrin Lehmann

    Source: Information-management

  • Data warehousing: ETL, ELT, and the use of big data

    If your company keeps up with the trends in data management, you likely have encountered the concepts and definitions of data warehouse and big data. When your data professionals try to implement data extraction for your business, they need a data repository. For this purpose, they can use a data warehouse and a data lake.

    Roughly speaking, a data lake is mainly used to gather and preserve unstructured data, while a data warehouse is intended for structured and semi-structured data.

    Data warehouse modeling concepts

    All data in a data warehouse is well-organized, archived, and arranged in a particular way. Not all data that can be gathered from multiple sources reaches a data warehouse. The source of data is crucial since it impacts the quality of data-driven insights and hence, business decisions.

    During the phase of data warehouse development, a lot of time and effort is needed to analyze data sources and select useful ones. Whether a data source has value depends on the business processes. Data only gets into the warehouse when its value is confirmed.

    On top of that, the way data is represented in your database has a critical role. Concepts of data modeling in a data warehouse are a powerful expression of business requirements specific to a company. A data model determines how data scientists and software engineers will design, create, and implement a database.

    There are three basic types of modeling. A conceptual data model describes all entities a business needs information about. It provides facts about real-world things, customers, and other business-related objects and relations. The goal of creating this data model is to synthesize and store all the data needed to gain an understanding of the whole business. This model is designed for the business audience.

    A logical data model suits more in-depth data. It describes the structure of data elements, their attributes, and the ways these elements interrelate. For instance, this model can be used to identify relationships between customers and the products of interest to them. This model is characterized by a high level of clarity and accuracy.

    A physical data model describes the specific data and relationships needed for a particular case, as well as the way the data model is used in database implementation. It provides a wealth of meta-data and facilitates visualizing the structure of a database. Meta-data can involve accesses, limitations, indexes, and other features.

    ELT and ETL data warehouse concepts

    Large amounts of data sorted for warehousing and analytics require a special approach. Businesses need to gather and process data to retrieve meaningful insights. Thus, data should be manageable, clean, and suitable for molding and transformation.

    ETL (extract, transform, load) and ELT (extract, load, transform) are the two approaches that have technological differences but serve the same purpose – to manage and analyze data.

    ETL is the paradigm that enables data extraction from multiple sources and pulling data into a single database to serve a business.

    At the first stage of the ETL process, engineers extract data from different databases and gather it in a single place. The collected data undergo transformation to take the form required for a target repository. Then the data come to a data warehouse or a target database.

    If you switch the letters 'T' and 'L', you get the ELT process. After retrieval, the data can be loaded straight into the target database. Cloud technology provides large, scalable storage, so massive datasets can be loaded first and then transformed as business requirements dictate.

    The ELT paradigm is a newer alternative to the well-established ETL process. It is flexible and allows raw data to be processed quickly. On the one hand, ELT requires special tools and frameworks, but on the other, it gives BI and data analytics experts direct access to business data, saving them a great deal of time.
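
    The difference in ordering can be shown with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for a real warehouse; the sample rows and the table names (`sales_etl`, `sales_raw`, `sales_elt`) are made up for illustration.

```python
import sqlite3

# Made-up source rows: amounts arrive as untidy text, as they might from a source system.
SOURCE_ROWS = [("alice", " 10.50 "), ("bob", "7"), ("alice", "4.50")]


def run_etl(conn: sqlite3.Connection) -> None:
    # ETL: transform in application code first, then load the clean rows.
    cleaned = [(name, float(amount.strip())) for name, amount in SOURCE_ROWS]
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def run_elt(conn: sqlite3.Connection) -> None:
    # ELT: load the raw rows as-is, then transform inside the target with SQL.
    conn.execute("CREATE TABLE sales_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", SOURCE_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT customer, CAST(TRIM(amount) AS REAL) AS amount FROM sales_raw"
    )


def totals(conn: sqlite3.Connection, table: str) -> list:
    query = (
        f"SELECT customer, SUM(amount) FROM {table} "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_totals = totals(conn, "sales_etl")
elt_totals = totals(conn, "sales_elt")
```

    Both routes end with the same totals. What differs is where the transformation runs, and that the ELT route keeps the raw table available for later, different transformations.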

    ETL testing concepts are also essential to ensure that data is loaded into a data warehouse correctly and accurately. This testing involves data verification at transitional phases, so that before data reaches the destination, its quality and usefulness are already verified.
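
    As one possible illustration of verification at a transitional phase, the sketch below compares source rows with loaded rows. The three checks (row count, missing keys, null required fields) are common reconciliation checks chosen here as an assumption; the function name `check_load` and the sample data are hypothetical.

```python
def check_load(source_rows, loaded_rows, key, required_fields):
    """Return a list of readable problems; an empty list means all checks passed."""
    problems = []

    # Check 1: every extracted row should have been loaded.
    if len(source_rows) != len(loaded_rows):
        problems.append(
            f"row count mismatch: {len(source_rows)} source vs {len(loaded_rows)} loaded"
        )

    # Check 2: no business key may disappear on the way.
    missing = {row[key] for row in source_rows} - {row[key] for row in loaded_rows}
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    # Check 3: required fields must not be null after transformation.
    for row in loaded_rows:
        for field_name in required_fields:
            if row.get(field_name) is None:
                problems.append(f"null {field_name} for key {row[key]}")

    return problems


source = [{"id": 1, "amount": "10"}, {"id": 2, "amount": "5"}]
good_load = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.0}]
bad_load = [{"id": 1, "amount": None}]

good_result = check_load(source, good_load, key="id", required_fields=["amount"])
bad_result = check_load(source, bad_load, key="id", required_fields=["amount"])
```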

    Types of data warehouse for your company

    Different data warehouse concepts presuppose the use of particular techniques and tools to work with data. Basic data warehouse concepts also differ depending on a company’s size and purposes of using data.

    An enterprise data warehouse enables a unified approach to organizing, visualizing, and representing all the data across a company. Data can be classified by subject and accessed based on this attribute.

    A data mart is a subcategory of a data warehouse designed for specific tasks in business areas such as retail, finance, and so forth. Data comes into a data mart straight from the sources.

    An operational data store satisfies the reporting needs within a company. It is updated in real time, which makes this solution best suited for keeping all business records.

    Big data and data warehouse ambiguity

    A data warehouse is an architecture that has proved to be valuable for data storing over the years. It involves data that has a defined value and can be used from the start to solve some business needs. Everyone can access this data, and the features of datasets are reliability and accuracy.

    Big data is a hyped field these days. It is the technology that allows retrieving data from heterogeneous sources. The key features of big data are volume, velocity or data streams, and a variety of data formats. Unlike a data warehouse, big data is a repository that can hold unstructured data as well.

    Companies seek to adopt custom big data solutions to unlock useful information that can help improve decision-making. These solutions help drive revenue, increase profitability, and cut customer churn thanks to the comprehensive information collected and available in one place.

    Data warehouse implementation entails advantages in terms of making informed decisions. It provides comprehensive insights into what is going on within a company, while big data can be in the shape of massive but disorganized datasets. However, big data can be later used for data warehousing.

    Running a data-driven business means dealing with billions of data points on in-house and external operations, consumers, and regulations.

    Author: Katrine Spirina

    Source: In Data Labs

  • How data management can learn from basketball

    A data management plan is not something that can be implemented in isolation by one department or team in your organisation; rather, it is a collective effort, similar to how different players perform together on a basketball court.

    From the smallest schoolyard to the biggest pro venue, from the simplest pickup game to the NBA finals, players, coaches, and even fans will tell you that having a game plan and sticking to it is crucial to winning. It makes sense; while all players bring their own talents to the contest, those talents have to be coordinated and utilized for the greater good. When players have real teamwork, they can accomplish things far beyond what they could achieve individually, even if they are nominally part of the squad. When team players aren’t displaying teamwork, they’re easy targets for competitors who know how to read their weaknesses and take advantage of them.

    Basketball has been used as an analogy for many aspects of business, from coordination to strategy, but the business activity that basketball most resembles is, believe it or not, data management. Perhaps more than anything, companies need to stick to their game plan when it comes to handling data: storing it, labeling it, and classifying it.

    A good data management plan could mean a winning season

    Without a plan followed by everyone in the organization, companies will soon find that their extensive collections of data are useless, just like the top talent a team manages to amass is useless without everyone on a team knowing what their role is. Failure to develop a data management plan could cost a company in time, and even in money. If data is not classified or labeled properly, search queries are likely to miss a great deal of it, skewing reports, profit and loss statements, and much more. 

    Even more worrying for companies is the need to be able to produce data when regulators come calling. With the implementation of the European Union’s General Data Protection Regulation (GDPR), companies no longer have the option of going without a tight game plan for data management. According to GDPR rules, all EU citizens have 'the right to be forgotten', which requires companies to know what data they have about an individual, and demonstrate an ability to delete it to EU inspectors on demand. Those rules apply not just to companies in Europe, but to all companies that do business with EU residents as well. GDPR violators could be fined as much as €20 million, or 4% of annual global turnover, whichever is greater.
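
    The 'whichever is greater' rule is simple arithmetic, sketched below with whole-euro integers. The function name and the sample turnover figures are made up for illustration; this only restates the two figures quoted above and is not a model of how fines are actually set.

```python
FIXED_CAP_EUR = 20_000_000  # the EUR 20 million figure quoted above
TURNOVER_PERCENT = 4        # the 4% figure quoted above


def max_fine_eur(annual_global_turnover_eur: int) -> int:
    """Upper bound of the fine: the greater of the fixed cap and 4% of turnover."""
    turnover_based = annual_global_turnover_eur * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_EUR, turnover_based)


small_company_cap = max_fine_eur(100_000_000)    # 4% is EUR 4 million, so the fixed cap applies
large_company_cap = max_fine_eur(1_000_000_000)  # 4% is EUR 40 million, which exceeds the fixed cap
```

    The turnover-based figure only takes over once annual global turnover passes €500 million, the point at which 4% equals €20 million.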

    Even companies that have no EU clients or customers need to improve their data management game, because GDPR-style rules are moving stateside as well. California recently passed its own digital privacy law (set to go into effect in January), which gives state residents the right to be forgotten, and other states are considering similar laws. And with heads of large tech firms calling for privacy legislation in the U.S., it’s likely that federal legislation on the matter will be passed sooner rather than later.

    Data Management Teamwork, When and Where it Counts

    In basketball, players need to be molded to work together as a unit. A rogue player who decides that they want to be a 'shooting star' instead of following the playbook and passing when appropriate may make a name for themselves, but the team they are playing for is unlikely to benefit much from that kind of approach. Only when all the players work together, with each move complementing the other as prescribed by the game plan, can a team succeed.

    In data management, teams generate information that the organization can use to further its business goals. Data on sales, marketing, engagement with customers, praises and complaints, how long it takes team members to carry out and complete tasks, and a million other metrics all go into the databases and data storage systems of organizations for eventual analysis.

    With that data, companies can accomplish a great deal: Improve sales, make operations more efficient, open new markets, research new products and improve existing ones, and much more. That, of course, can only happen if all departments are able to access the data collected by everyone.

    Metadata management - A star 'player'

    Especially important is the data about data: the metadata, used to refer to data structures, labels, and types. When different departments, and even individual employees, are responsible for entering data into a repository, they need to follow the metadata 'game plan': the one where all data is labeled according to a single standard, using common dictionaries, glossaries, and catalogs. Without that plan, data could easily get 'lost', and putting together search queries could be very difficult.

    Another problem is the fact that different departments will use different systems and products to process their data. Each data system comes with its own rules, and of course each set of rules is different. That there is no single system for labeling between the different products just contributes to the confusion, making resolution of metadata issues all the more difficult.

    Unfortunately, not everyone is always a team player when it comes to metadata. Due to pressure of time or other issues, different departments tend to use different terminology for data. For example, a department that works with Europe may label its dates in the form of year/month/day, while one that deals with American companies will use the month/day/year label. In a search form, the fields for 'years' and 'days' will not match across all data repositories, creating confusion. The department 'wins', but what about everyone else? And even in situations where the same terminology is used, the fact that different data systems are in use could impact metadata.
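
    One way to reconcile such mismatched date labels is to convert every department's dates to a single standard on entry. The sketch below assumes ISO 8601 (`YYYY-MM-DD`) as that standard; the department names and the format table are hypothetical, with formats following the example in the paragraph above.

```python
from datetime import datetime

# Hypothetical per-department conventions, following the example above:
# one department writes year/month/day, the other month/day/year.
DEPARTMENT_DATE_FORMATS = {
    "europe_desk": "%Y/%m/%d",
    "us_desk": "%m/%d/%Y",
}


def to_iso_date(raw: str, department: str) -> str:
    """Parse a date in the department's own format and return it as YYYY-MM-DD."""
    fmt = DEPARTMENT_DATE_FORMATS[department]
    return datetime.strptime(raw, fmt).date().isoformat()


europe_date = to_iso_date("2019/04/03", "europe_desk")
us_date = to_iso_date("04/03/2019", "us_desk")
```

    Both inputs describe 3 April 2019 and end up with the same stored value, so a single search field for year, month, or day works across repositories.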

    Different departments have different objectives and goals, but team members cannot forget the overall objective: helping the 'team', the whole company, to win. The data they contribute is needed for those victories, those advancements. Without it, important opportunities could be lost. When data management isn’t done properly, teams may accomplish their own objectives, but the overall advancement of the company will suffer.

    'Superstars', whose objective is to aggrandize themselves, have no place on a basketball team; they should be playing one-on-one hoops with others of their type. Teams in companies should learn the lesson: if you want to succeed in basketball, or in data management, you need to work together with others, following the data plan that will ensure success for everyone.

    Author: Amnon Drori

    Source: Dataconomy

  • Leading your organization to success through Agile Data Governance

    Laura Madsen wants to challenge your outdated ideas about Data Governance. “I’m pretty sure that we wouldn’t use software that we used 20 years ago, but we’re still using Data Governance and Data Governance methodologies the same way we did 20 years ago.” And although she advocates for Agile, she’s not an Agile coach or a SCRUM master; rather she wants companies to consider agility in a broader sense as well. “Very briefly, when we think about Agile, essentially, we think about reducing process steps.” She paraphrases David Hussman’s belief that there is no inherent value in “process” — process exists in order to prove to other people that “we’re doing something.” To that end, most organizations create an enormous number of process steps she refers to as “flaming hoops,” showing that there was a lot of work put into activities such as status updates, but nothing that provided actual value.

    Madsen is the author of Disrupting Data Governance, Chief Executive Guru at Via Gurus, and Mastermind at the Sisterhood of Technology Professionals (Sistech).

    Resource Use

    Historically, Data Governance has always been resource-intensive, and with Agile Data Governance in particular, she said, the most important resource is the individuals who do the work. The need for a data owner and a data steward for each domain, often with multiple stewards or owners covering the same data domain, etc., emerged as a system designed to serve data warehouses with hundreds of tables, and thousands of rows per table. “That’s a rather parochial idea in 2020, when we have petabytes of data blown through our data warehouses on any given day.”

    One resource-heavy relic from the past is the standing committee, which always starts off with a lot of participation and enthusiasm, but over time people disengage and participation dwindles. Another historical shortcoming in Data Governance is the reliance on one or two individuals who hold the bulk of the institutional knowledge. With the amount of risk attached to Data Governance processes, the people who serve as the governance linchpin are under a lot of pressure to do more with less, so when they leave, the Data Governance program often collapses.

    Instead, she recommends developing responsive and resilient capabilities by creating a dependency on multiple people with similar capabilities instead of one person who knows everything.

    To make best use of time and resources, committees should be self-forming and project-based. Distinct functions must be created for participating resources: “And we should be laser clear about what people are doing.”

    The Kitchen Sink

    Still another legacy from the past is the tendency to take a “kitchen sink” approach, throwing compliance, risk, security, quality, and training all under the aegis of Data Governance, creating a lack of clarity in roles. “When you do everything, then you’re really doing nothing,” she said. Data stewards aren’t given formal roles or capabilities, and as such, they consider their governance duties as something they do on the side, “for fun.”

    Madsen sees this as arising out of the very broad scope of the historical definition of Data Governance. Intersecting with so many different critical areas, Data Governance has become a catch-all. In truth, she said, instead of being wholly responsible for compliance, risk, security, protection, data usage, and quality, Data Governance lives in the small area where all of those domains overlap.

    She considers this narrower focus as essential to survival in modern data environments, especially now, when there are entire departments devoted to these areas. Expecting a Data Governance person to be fully accountable for compliance, privacy, risk, security, protection, data quality, and data usage, “is a recipe for absolute disaster.” Today, she said, there is no excuse for being haphazard about what people are doing in those intersecting areas.

    Four Aspects of Success

    To succeed, companies must move away from the kitchen sink definition of Data Governance and focus on four aspects: increased data use, data quality, data management, and data protection.

    These categories will not need equal focus in every organization, and it’s expected that priorities will shift over time. Madsen showed a slide with some sample priorities that could be set with management input:

    • Increased data use at 40% importance
    • Quality at 25%
    • Management at 25%
    • Protection at 10%

    From an Agile perspective, every sprint or increment can be measured against those values, creating “an enormous amount of transparency.” And although executives may not care about the specific tasks used to address those priorities, they will care that they are being tackled strategically, she said.
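
    How a sprint is measured against those values is not spelled out, so the following is only one possible reading: compare the share of sprint effort spent on each category with its target weight. The story-point figures and the function name are invented for illustration.

```python
# Sample priorities from the slide described above, in percent.
PRIORITY_WEIGHTS = {"use": 40, "quality": 25, "management": 25, "protection": 10}


def effort_vs_priority(sprint_points: dict) -> dict:
    """Per category: percentage of sprint effort minus the target weight."""
    total = sum(sprint_points.values())
    if total == 0:
        raise ValueError("sprint has no recorded effort")
    report = {}
    for category, weight in PRIORITY_WEIGHTS.items():
        share = round(100 * sprint_points.get(category, 0) / total)
        report[category] = share - weight
    return report


# Hypothetical sprint: story points completed per category.
sprint = {"use": 10, "quality": 5, "management": 5, "protection": 5}
gaps = effort_vs_priority(sprint)
```

    A positive gap means the sprint spent more effort on a category than its priority suggests; a negative gap means less. In this made-up sprint, protection is 10 points over its target while quality and management are each 5 points under.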

    Increased Use of Data

    If the work of Data Governance isn’t leading to greater use of data, she says, “What the heck are we doing?” Building data warehouses, creating dashboards, and delivering ad hoc analytics are only useful if they enable greater data use. All governance activity should be focused toward that end. The only way to get broad support for Data Governance is to increase the usage of the data.

    Data Quality

    Record counts and data profiling can show what’s in the data and whether or not the data is right, but analysis is not the same as data quality. “What we’re really driving towards here is the context of the data,” Madsen said, which leads to increased data use. The core of Data Quality Management is ensuring it has meaning, and the only way for the data to have meaning is to provide context.

    Data Management

    She talks specifically about the importance of lineage within the context of Data Management. Most end users only interact with their data at the front end when they’re entering something, and at the back end, when they see it on a report or a dashboard. Everything that happens in between those two points is a mystery to them, which creates anxiety or confusion about the accuracy or meaning of the end result. “Without lineage tools, without the capability of looking at and knowing exactly what happened from the source to the target, we lose our ability to support our end users.” For a long time those tools didn’t exist, but now they do, and those questions can be answered very quickly, she said.
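
    What a lineage lookup answers can be sketched with a small graph walk. This is not how any particular lineage tool works; the data set names and the `DERIVED_FROM` mapping are hypothetical.

```python
# Hypothetical lineage edges: each data set maps to the data sets it was derived from.
DERIVED_FROM = {
    "sales_dashboard": ["sales_mart"],
    "sales_mart": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_source"],
    "customers_clean": ["crm_source"],
}


def upstream(dataset: str) -> list:
    """All data sets feeding `dataset`, nearest first (breadth-first walk)."""
    seen = []
    queue = list(DERIVED_FROM.get(dataset, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(DERIVED_FROM.get(current, []))
    return seen


dashboard_lineage = upstream("sales_dashboard")
```

    For an end user looking at the dashboard, the result lists every step between the report and the systems where the data was entered.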

    Data Protection

    Although Data Governance has a part in mitigating risk and protecting data, again, these are areas where governance should not be fully responsible. Instead, governance should be creating what Madsen calls “happy alliances” with those departments directly responsible for data protection, and focusing on facilitating increased data usage. This is often reversed in many organizations: If data is locked down to the point where it’s considered “completely safe,” risk may be under control, but no one is using it.

    Moving into the Future/Sweeping Away the Past—Fixing Committees

    Committees, she said, are not responsive, they’re not Agile and they don’t contribute to a resilient Data Governance structure. Theoretically, they do create a communication path of sorts, because a standing meeting at least assumes participants are paying attention for a specific period of time — until they lose interest. 

    What works better, she said, is self-forming Scrum teams or self-forming Agile teams that are on-demand or project-based, using a “backlog” (list of tasks) that becomes the working document for completing the company’s project list. “You come together, you work on the thing, and then you go your own separate ways.”

    A sample self-forming Agile team might consist of a CDO serving as product owner; representatives from security, privacy, and IT, who create regulatory and IT standards; and executives from business departments like finance, sales, or operations, who might also serve as subject matter experts.

    The backlog serves as a centralized document where data issues are tracked, responsibilities are outlined and milestones on the way to completion are logged.

    Traditional concepts like data ownership and data stewardship still have a part, but they are in service to a project or initiative rather than a fixed area or department. When the project is completed, the team disbands.

    Named Data Stewards

    Named data stewards serve as a resource for a particular project or area, such as the customer data domain. Named data stewards or owners for each area of responsibility should be published so that anyone can quickly and easily find the data steward for any particular domain.

    On Demand Data Stewards

    “Everyone’s a data steward, just like everyone’s in charge of sales.” Anyone who has a question about the data and wants to know more is, in that moment, a data steward, she said, whether they are trained for it or not. By taking ownership of a question and being willing to find an answer, the “on-demand” steward gains the ability to help the organization do a better job in that particular moment. “Ownership is so integral to successful deployment of any data function in an organization.”

    Ensuring Success

    To sum up, Madsen recommends starting a backlog, using it to consistently document exit criteria (your definition of “done”), and committing to actively managing it. Start thinking like a Data Governance product owner, keep communications open among intersecting areas — those “happy alliances” — and keep the ultimate goal of increased data use in mind. Focus on progress over perfection, she says, “And then just keep swimming, just keep swimming …”

    Author: Amber Lee Dennis

    Source: Dataversity

  • Making your Organization more intelligent with a Cloud Data Strategy

    Making your Organization more intelligent with a Cloud Data Strategy

    At a time when most major companies are showing a long-range commitment to “data-driven culture,” data is considered the most prized asset. An Enterprise Data Strategy, along with aligned technology and business goals, can significantly contribute to the core performance metrics of a business. The underlying principles of an Enterprise Data Strategy comprise a multi-step framework, a well-designed strategy process, and a definitive plan of action. However, in reality, very few businesses today have their Data Strategy aligned with overall business and technology goals.

    Data Management Mistakes Are Costly

    Unless the overall business and technology goals of a business are aligned with a Data Strategy, the business may suffer expensive Data Management failure incidents from time to time. If the Data Strategy is implemented in line with a well-laid out action plan that seeks to transform the current state of affairs into “strategic Data Management initiatives” leading to the fulfillment of desirable business needs and objectives in the long term, then there is a higher chance of that Data Strategy achieving the desired outcomes. 

    Data provides “insights” that businesses use for competitive advantage. When overall business goals and technology goals are left out of the loop of an Enterprise Data Strategy, the data activities are likely to deliver wrong results, and cause huge losses to the business.

    What Can Businesses Do to Remain Data-Driven?

    Businesses that have adopted a data-driven culture and those expecting to do so, can invest some initial time and effort to explore the underlying relationships between the overall business goals, technology goals, and Data Strategy goals. The best part is they can use their existing advanced analytics infrastructure to make this assessment before drafting a policy document for developing the Data Strategy.

    This initial investment in time and effort will go a long way toward ensuring that the business’s core functions (technology, business, and Data Science) are aligned and have the same objectives. Without this effort, the Data Strategy can easily become fragmented and resource-heavy—and ineffective.

    According to Anthony Algmin, Principal at Algmin Data Leadership, “Thinking of a Data Strategy as something independent of Business Strategy is a recipe for disaster.”

    Data Governance has recently become a central concern for data-centric organizations, and all future Data Strategies will include Data Governance as a core component. Future Data Strategy initiatives will have to take regulatory compliance seriously to ensure their long-term success. The hope is that this year, businesses will employ advanced technologies like big data, graph, and machine learning (ML) to design and implement a strong Data Strategy.

    In today’s digital ecosystem, the Data Strategy means the difference between survival and extinction of a business. Any business that is thinking of using data as a strategic asset for predetermined business outcomes must invest in planning and developing a Data Strategy. The Data Strategy will not only aid the business in achieving the desired objectives, but will also keep the overall Data Management activities on track.

    A Parallel Trend: Rapid Cloud Adoption

    As Data Strategy and Data Governance continue to gain momentum among global businesses, another parallel trend that has surfaced is the rapid shift to cloud infrastructures for business processing.

    As with on-premises Data Management practices, Cloud Data Management practices revolve around MDM, Metadata Management, and Data Quality. As organizations continue their journey to the cloud, they will need to ensure their Data Management practices conform to all Data Quality and Data Governance standards.

    A nagging concern among business owners and operators who have either shifted to the cloud or are planning a shift is data security and privacy. In fact, many medium or smaller operations have resisted the cloud because they are unsure or uninformed about the data protection technologies available on the cloud. Current business owners expect cloud service providers to offer premium data protection services.

    The issues around Cloud Data Management are many: the ability of cloud resources to handle high-volume data, the security leaks in data transmission pipelines, data storage and replication policies of individual service providers, and the possibilities of data loss from cloud hosts. Cloud customers want uninterrupted data availability, low latency, and instant recovery—all the privileges they have enjoyed so far in an on-premise data center.

    One technology solution often discussed in the context of cloud data protection is JetStream. Through a live webinar, Arun Murthy, co-founder and Chief Product Officer of Hortonworks, demonstrated how the cloud needs to be a part of the overall Data Strategy to fulfill business needs like data security, Data Governance, and holistic user experience. The webinar proceedings are discussed in Cloud Computing—an Extension of Your Data Strategy.

    Cloud Now Viewed as Integral Part of Enterprise Data Strategy

    One of the most talked about claims made by industry experts at the beginning of 2017 was that it “would be a tipping point for the cloud.” These experts and cloud researchers also suggested that the cloud would bring transformational value to business models through 2022, and would become an inevitable component of business models. According to market-watcher Forrester, “cloud is no longer about cheap servers or storage, (but), the best platform to turn innovative ideas into great software quickly.”

    As cloud enables big data analytics at scale, it is a popular computing platform for larger businesses who want the benefits without having to make huge in-house investments. Cloud holds promises for medium and small businesses, too, with tailor-made solutions for custom computing needs at affordable cost.

    The following points should be kept in mind while developing a strategy plan for the cloud transformation:

    • Consensus Building for Cloud Data Strategy: The core requirement behind building a successful Data Strategy for the cloud is consensus building between the central IT Team, the cloud architect, and the C-Suite executives. This problem is compounded in cases where businesses may be mixing and matching their cloud implementations.
    • Data Architectures on Native Cloud: The news feature titled Six Key Data Strategy Considerations for Your Cloud-Native Transformation throws light on cloud-native infrastructure, which is often ignored during a business transformation. According to this article, though enterprises are busy making investments in a cloud-native environment, they rarely take the time to plan the transformation, thus leaving Data Architecture issues like data access and data movement unattended. 
    • Creating Data Replicas: Data replication on the cloud must avoid legacy approaches, which typically enabled data updating after long durations.
    • Data Stores across Multiple Clouds: HIT Think: How to Assess Weak Links in a Cloud Data Strategy specifically refers to storage of healthcare data, where data protection and quick data recovery are achieved through the provisioning of multiple cloud vendors. These solutions are not only cost-friendly, but also efficient and secure. 

    Author: Paramita (Guha) Ghosh

    Source: Dataversity

  • Managing data at your organization? Take a holistic approach

    Managing data at your organization? Take a holistic approach

    Taking a holistic approach to data requires considering the entire data lifecycle – from gathering, integrating, and organizing data to analyzing and maintaining it. Companies must create a standard for their data that fits their business needs and processes. To determine what those are, start by asking your internal stakeholders questions such as, “Who needs access to the data?” and “What do each of these departments, teams, or leaders need to know? And why?” This helps establish what data is necessary, what can be purged from the system, and how the remaining data should be organized and presented.

    This holistic approach helps yield higher-quality data that’s more usable and more actionable. Here are three reasons to take a holistic approach at your organization:

    1. Remote workforce needs simpler systems

    We saw a massive shift to work-from-home in 2020, and that trend continues to pick up speed. Companies like Twitter, Shopify, Siemens, and the State Bank of India are telling employees they can continue working remotely indefinitely. And according to the World Economic Forum, the number of people working remotely worldwide is expected to double in 2021.

    This makes it vital that we simplify how people interact with their business systems, including CRMs. After all, we still need answers to everyday questions like, “Who’s handling the XYZ account now?” and “How did customer service solve ABC’s problem?” But instead of being able to ask the person in the next office or cubicle, we’re forced to rely on a CRM to keep us up to date and make sure we’re moving in the right direction.

    This means team members must input data in a timely manner, and others must be able to access that data easily and make sense of it, whether it’s to view the sales pipeline, analyze a marketing campaign’s performance, or spot changes in customer buying behavior.

    Unfortunately, the CRMs used by many companies make data entry and analytics challenging. At best, this is an efficiency issue. At worst, it means people aren’t inputting the data that’s needed, and any analysis of spotty data will be flawed. That’s why we suggest companies focus on improving their CRM’s user interface, if it isn’t already user-friendly.

    2. A greater need for data accuracy

    The increased reliance on CRM data also means companies need to ramp up their Data Quality efforts. People need access to clean, accurate information they can act on quickly.

    It’s a profound waste of time when the sales team needs to verify contact information for every lead before they reach out, or when data scientists have to spend hours each week cleaning up data before they analyze it.

    Yet, according to online learning company O’Reilly’s The State of Data Quality 2020 report, 40% or more of companies suffer from these and other major Data Quality issues:

    • Poor quality controls when data enters the system
    • Too many data sources and inconsistent data
    • Poorly labeled data
    • Disorganized data
    • Too few resources to address Data Quality issues

    These are serious systemic issues that must be addressed in order to deliver accurate data on an ongoing basis.

    3. A greater need for automation

    Data Quality Management is an ongoing process throughout the entire data lifecycle. We can’t just clean up data once and call it done.

    Unfortunately, many companies are being forced to work with smaller budgets and leaner teams these days, yet the same amount of data cleanup and maintenance work needs to get done. Automation can help with many of the repetitive tasks involved in data cleanup and maintenance. This includes:

    • Standardizing data
    • Removing duplicates
    • Preventing new duplicates
    • Managing imports
    • Importing/exporting data
    • Converting leads
    • Verifying data
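
    A few of the tasks listed above (standardizing data, removing duplicates, and preventing new duplicates) can be sketched in a few lines. This is a minimal illustration, not a CRM integration: the record fields and the choice of the email address as the matching key are assumptions made for the example.

```python
import re


def standardize(record: dict) -> dict:
    """Normalize casing and whitespace so equivalent records compare equal."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "phone": re.sub(r"\D", "", record["phone"]),  # keep digits only
    }


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each email (assumed matching key)."""
    seen = set()
    unique = []
    for record in map(standardize, records):
        if record["email"] not in seen:
            seen.add(record["email"])
            unique.append(record)
    return unique


def add_record(existing: list[dict], new: dict) -> bool:
    """Prevent new duplicates: insert only if the email is not present."""
    candidate = standardize(new)
    if any(r["email"] == candidate["email"] for r in existing):
        return False
    existing.append(candidate)
    return True


# Made-up sample data: the first two rows describe the same person.
raw = [
    {"name": "ada  lovelace", "email": " Ada@Example.com ", "phone": "(555) 010-0001"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": "555-010-0001"},
    {"name": "grace hopper", "email": "grace@example.com", "phone": "555 010 0002"},
]
clean = deduplicate(raw)
print(len(clean), "unique records")
```

    Standardizing before comparing is what lets the two spellings of the same contact collapse into one record; real matching rules are usually fuzzier than an exact email comparison.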

    A solid business case

    By taking a holistic approach to Data Management – including simplifying business systems, improving data accuracy, and automating whenever possible – companies can improve the efficiency and effectiveness of teams throughout their organization. These efforts will help organizations come through the pandemic stronger, with a “new normal” for data that’s far better than what came before.

    Author: Oilivia Hinkle

    Source: Dataversity

  • Master Data Management and the role of (un)structured data

    Traditional conversations about master data management’s utility have centered on determining what actually constitutes MDM, how to implement data governance with it, and the balance between IT and business involvement in the continuity of MDM efforts.

    Although these concerns will always remain apposite, MDM’s overarching value is projected to significantly expand in 2018 to directly create optimal user experiences—for customers and business end users. The crux of doing so is to globalize its use across traditional domains and business units for more comprehensive value.

    “The big revelation that customers are having is how do we tie the data across domains, because that reference of what it means from one domain to another is really important,” Stibo Systems Chief Marketing Officer Prashant Bhatia observed.

    The interconnectivity of MDM domains is invaluable not only for monetization opportunities via customer interactions, but also for streamlining internal processes across the entire organization. Oftentimes the latter facilitates the former, especially when leveraged in conjunction with contemporary opportunities related to the Internet of Things and Artificial Intelligence.

    Structured and Unstructured Data

    One of the most eminent challenges facing MDM related to its expanding utility is the incorporation of both structured and unstructured data. Fueled in part by the abundance of external data besieging the enterprise from social, mobile, and cloud sources, unstructured and semi-structured data can pose difficulties to MDM schema.

    After attending the recent National Retail Federation conference with over 30,000 attendees, Bhatia noted that one of the primary themes was, “Machine learning, blockchain, or IoT is not as important as how does a company deal with unstructured data in conjunction with structured data, and understand how they’re going to process that data for their enterprise. That’s the thing that companies—retailers, manufacturers, etc.—have to figure out.”

    Organizations can integrate these varying data types into a single MDM platform by leveraging emerging options for schema and taxonomies with global implementations, naturally aligning these varying formats together. The competitive advantage generated from doing so is virtually illimitable. 

    Original equipment manufacturers and equipment asset management companies can attain real-time, semi-structured or unstructured data about failing equipment and use that to influence their product domain with attributes informing the consequences of a specific consumer’s tire, for example. The aggregation of that semi-structured data with structured data in an enterprise-spanning MDM system can influence several domains. 

    Organizations can reference it with customer data for either preventive maintenance or discounted purchase offers. The location domain can use it to provide these services close to the customer; integrations with lifecycle management capabilities can determine what went wrong and how to correct it. “That IoT sensor provides so much data that can tie back to various domains,” Bhatia said. “The power of the MDM platform is to tie the data for domains together. The more domains that you can reference with one another, you get exponential benefits.”

    Universal Schema

    Although the preceding example pertained to the IoT, it’s worth noting that it’s applicable to virtually any data source or type. MDM’s capability to create these benefits is based on its ability to integrate different data formats on the back end. A uniformity of schema, taxonomies, and data models is desirable for doing so, especially when using MDM across the enterprise. 

    According to Franz CEO Jans Aasman, traditionally “Master Data Management just perpetuates the difficulty of talking to databases. In general, even if you make a master data schema, you still have the problem that all the data about a customer, or a patient, or a person of interest is still spread out over thousands of tables.” 

    Varying approaches can address this issue; there is growing credence around leveraging machine learning to obtain master data from various stores. Another approach is to considerably decrease the complexity of MDM schema so it’s more accessible to data designated as master data. By creating schema predicated on an exhaustive list of business-driven events, organizations can reduce the complexity of myriad database schemas (or even of conventional MDM schemas) so that their “master data schema is incredibly simple and elegant, but does not lose any data,” Aasman noted.

    Global Taxonomies

    Whether simplifying schema based on organizational events and a list of their outcomes or using AI to retrieve master data from multiple locations, the net worth of MDM is based on the business’s ability to inform the master data’s meaning and use. The foundation of what Forrester terms “business-defined views of data” is oftentimes the taxonomies predicated on business use as opposed to that of IT. Implementing taxonomies enterprise-wide is vital for the utility of multi-domain MDM (which compounds its value) since frequently, as Aasman indicated, “the same terms can have many different meanings” based on use case and department.

    The hierarchies implicit in taxonomies are infinitely utilitarian in this regard, since they enable consistency across the enterprise yet have subsets for various business domains. According to Aasman, the Financial Industry Bank Ontology can also function as a taxonomy in which, “The higher level taxonomy is global to the entire bank, but the deeper you go in a particular business you get more specific terms, but they’re all bank specific to the entire company.” 

    The ability of global taxonomies to link together meaning in different business domains is crucial to extracting value from cross-referencing the same master data for different applications or use cases. In many instances, taxonomies provide the basis for search and queries that are important for determining appropriate master data.
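
    One way to picture a global taxonomy with domain-specific subsets is a lookup that starts in the most specific business domain and falls back to its parents. The sketch below assumes that structure; the domain names, terms, and definitions are invented for illustration and are not taken from any published ontology.

```python
# Hypothetical taxonomy: each domain names its parent and its own terms.
TAXONOMY = {
    "bank": {"parent": None,
             "terms": {"account": "any customer-held account"}},
    "retail": {"parent": "bank",
               "terms": {"account": "checking or savings account"}},
    "mortgages": {"parent": "retail",
                  "terms": {"lien": "claim on a property"}},
}


def resolve(term: str, domain: str) -> str:
    """Look up a term in the most specific domain, falling back to parents."""
    current = domain
    while current is not None:
        node = TAXONOMY[current]
        if term in node["terms"]:
            return node["terms"][term]
        current = node["parent"]
    raise KeyError(f"{term!r} is not defined for {domain!r}")


print(resolve("account", "mortgages"))  # inherited from the retail domain
print(resolve("account", "bank"))       # global, bank-wide meaning
```

    The same term resolves to a more specific meaning deeper in the hierarchy while every domain still shares the bank-wide definitions, which is the consistency-with-subsets property described above.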

    Timely Action

    By expanding the scope of MDM beyond traditional domain limitations, organizations can redouble the value of master data for customers and employees. By simplifying MDM schema and broadening taxonomies across the enterprise, they increase their ability to integrate unstructured and structured data for timely action. “MDM users in a B2B or B2C market can provide a better experience for their customers if they, the retailer and manufacturer, are more aware and educated about how to help their end customers,” Bhatia said.


    Author: Jelani Harper

    Source: Information Management

  • Rubrik named data resilience leader in latest Forrester report

    Rubrik named data resilience leader in latest Forrester report

    In the latest edition of the Forrester Wave report on Data Resilience Solutions, Rubrik has been named a leader. The provider of multi-cloud data control also received the highest score in the strategy category.

    Forrester evaluated ten vendors against forty criteria, grouped into three categories: current offering, strategy, and market presence. Rubrik achieved the highest possible score in strategy and security.

    'Rubrik is a good fit for companies looking to simplify, modernize, and consolidate their data resilience,' according to the report. Rubrik is described as a 'simple, intuitive, and powerful policy engine that governs the protection of data, regardless of its type, location, or purpose.'

    According to Rubrik CEO Bipul Sinha, the recognition shows that Rubrik is well positioned to lead the transformation of the data management market. 'Customers are placing ever higher demands on data management solutions, demands that go beyond backup and recovery alone. Receiving the highest score for strategy confirms that we are on the right track to keep meeting our customers' needs better through innovation.'

    Source: BI Platform

  • Solutions to help you deal with heterogeneous data sources

    Solutions to help you deal with heterogeneous data sources

    With enterprise data pouring in from different sources (CRM systems, web applications, databases, files, etc.), streamlining data processes is a significant challenge, as it requires integrating heterogeneous data streams. In such a scenario, standardizing data becomes a prerequisite for effective and accurate data analysis. The absence of the right integration strategy will give rise to application-specific and intradepartmental data silos, which can hinder productivity and delay results.

    Consolidating data from disparate structured, unstructured, and semi-structured sources can be complex. A survey conducted by Gartner revealed that one-third of respondents consider 'integrating multiple data sources' as one of the top four integration challenges.

    Understanding the common issues faced during this process can help enterprises successfully counteract them. Here are three challenges generally faced by organizations when integrating heterogeneous data sources, as well as ways to resolve them:

    Data extraction

    Challenge: Pulling source data is the first step in the integration process. But it can be complicated and time-consuming if data sources have different formats, structures, and types. Moreover, once the data is extracted, it needs to be transformed to make it compatible with the destination system before integration.

    Solution: The best way to go about this is to create a list of sources that your organization deals with regularly. Look for an integration tool that supports extraction from all these sources. Preferably, go with a tool that supports structured, unstructured, and semi-structured sources to simplify and streamline the extraction process.

    Data integrity

    Challenge: Data Quality is a primary concern in every data integration strategy. Poor data quality can be a compounding problem that can affect the entire integration cycle. Processing invalid or incorrect data can lead to faulty analytics, which if passed downstream, can corrupt results.

    Solution: To ensure that correct and accurate data goes into the data pipeline, create a data quality management plan before starting the project. Outlining these steps guarantees that bad data is kept out of every step of the data pipeline, from development to processing.
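
    Part of such a data quality plan can be automated as a validation gate in front of the pipeline. The following is a minimal sketch under assumed rules: the field names and checks in `RULES` are hypothetical and would be replaced by whatever the plan actually specifies.

```python
from datetime import date


# Hypothetical rules: each maps a field name to a validity check.
RULES = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str) and "@" in v,
    "signup_date": lambda v: isinstance(v, date) and v <= date.today(),
}


def validate(record: dict) -> list:
    """Return the names of fields that are missing or fail their rule."""
    return [name for name, check in RULES.items()
            if name not in record or not check(record[name])]


def split_valid(records: list):
    """Separate records that may enter the pipeline from those to quarantine."""
    valid, rejected = [], []
    for record in records:
        errors = validate(record)
        if errors:
            rejected.append((record, errors))
        else:
            valid.append(record)
    return valid, rejected


# Made-up sample records: one clean, one with bad values, one incomplete.
records = [
    {"customer_id": 1, "email": "a@example.com", "signup_date": date(2020, 1, 1)},
    {"customer_id": -5, "email": "no-at-sign", "signup_date": date(2020, 1, 1)},
    {"customer_id": 2, "email": "b@example.com"},
]
valid, rejected = split_valid(records)
print(len(valid), "accepted;", len(rejected), "quarantined")
```

    Keeping the rejected records together with the reason for rejection, rather than silently dropping them, gives whoever owns the source system something concrete to fix.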


    Scalability

    Challenge: Data heterogeneity leads to the inflow of data from diverse sources into a unified system, which can ultimately lead to exponential growth in data volume. To tackle this challenge, organizations need to employ a robust integration solution that has the features to handle high volume and disparity in data without compromising on performance.

    Solution: Anticipating the extent of growth in enterprise data can help organizations select the right integration solution that meets their scalability and diversity requirements. Integrating one data point at a time is beneficial in this scenario. Evaluating the value of each data point with respect to the overall integration strategy can help prioritize and plan. Say that an enterprise wants to consolidate data from three different sources: Salesforce, SQL Server, and Excel files. The data within each system can be categorized into unique datasets, such as sales, customer information, and financial data. Prioritizing and integrating these datasets one at a time can help organizations gradually scale data processes.
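
    The prioritization step in the example above can be sketched as a simple ranking. The sources and dataset names come from the example; the pairing of datasets with sources, and the `business_value` and `effort` scores, are hypothetical placeholders that an organization would have to supply itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    source: str
    name: str
    business_value: int   # hypothetical 1-5 score agreed with stakeholders
    effort: int           # hypothetical 1-5 estimate of integration effort


def integration_order(datasets):
    """Highest value first; ties broken by lowest effort."""
    return sorted(datasets, key=lambda d: (-d.business_value, d.effort))


datasets = [
    Dataset("Excel", "financial data", business_value=3, effort=2),
    Dataset("Salesforce", "sales", business_value=5, effort=3),
    Dataset("SQL Server", "customer information", business_value=5, effort=2),
]
for step, d in enumerate(integration_order(datasets), start=1):
    print(step, d.source, d.name)
```

    Integrating in this order means each increment delivers the most valuable data that can be reached for the least effort before the next dataset is taken on.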

    Author: Ibrahim Surani

    Source: Dataversity

  • The (near) future of data storage

    The (near) future of data storage

    As data proliferates at an exponential rate, companies must do more than simply store it: they must manage it expertly and look to new approaches. Companies that take new and creative approaches to data storage will be able to transform their operations and thrive in the digital economy.

    How should companies approach data storage in the years to come? As we look into our crystal ball, here are important trends in 2020. Companies that want to make the most of data storage should be on top of these developments.

    A data-centric approach to data storage

    Companies today are generating oceans of data, and not all of that data is equally important to their function. Organizations that know this, and know which pieces of data are more critical to their success than others, will be in a position to better manage their storage and better leverage their data.

    Think about it. As organizations deal with a data deluge, they are trying hard to maximize their storage pools. As a result, they can inadvertently end up putting critical data on less critical servers. Doing so is a problem because it typically takes longer to access data on slower, secondary machines. It’s this lack of speed and agility that can have a detrimental impact on businesses’ ability to leverage their data.

    Traditionally organizations have taken a server-based approach to their data backup and recovery deployments. Their priority is to back up their most critical machines rather than focusing on their most business-critical data.

    So, rather than having backup and recovery policies based on the criticality of each server, we will start to see organizations match their most critical servers with their most important data. In essence, the actual content of the data will become more of a decision-driver from a backup point of view.

    The most successful companies in the digital economy will be those that implement storage policies based not on their server hierarchy but on the value of their data.
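
    A value-based policy of this kind can be expressed as a mapping from a data set's business criticality to a storage and protection tier, independent of which server holds it. The sketch below is an assumption-laden illustration: the 1-to-5 criticality scale, the tier names, and the thresholds are all invented for the example.

```python
from enum import Enum


class Tier(Enum):
    # Hypothetical tiers; real offerings and backup schedules will differ.
    FLASH = "flash, continuous snapshots"
    DISK = "disk, nightly backup"
    ARCHIVE = "archive, weekly backup"


def assign_tier(criticality: int) -> Tier:
    """Map a data set's business criticality (1 = low, 5 = high) to a tier."""
    if not 1 <= criticality <= 5:
        raise ValueError("criticality must be between 1 and 5")
    if criticality >= 4:
        return Tier.FLASH
    if criticality >= 2:
        return Tier.DISK
    return Tier.ARCHIVE


# Made-up data sets scored by the business, not by the server they live on.
for name, score in [("order ledger", 5), ("marketing assets", 3), ("old logs", 1)]:
    print(name, "->", assign_tier(score).name)
```

    The input to the policy is a property of the data itself, so moving a data set to a different server does not change how it is protected.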

    The democratization of flash storage

    With the continuing rise of technologies like IoT, artificial intelligence, and 5G, there will be an ever-greater need for high-performance storage. This will lead to the broader acceptance of all-flash storage. The problem, of course, is that flash storage is like a high-performance car: cool and sexy, but the price is out of reach for most.

    And yet traditional disk storage simply isn’t up to the task. Disk drives are like your family’s old minivan: reliable but boring and slow, unable to turn on a dime. But we’re increasingly operating in a highly digital world where data has to be available the instant it’s needed, not the day after. In this world, every company (not just the biggest and wealthiest ones) needs high-performance storage to run their business effectively.

    As the cost of flash storage drops, more storage vendors are bringing all-flash arrays to the mid-market, and more organizations will be able to afford this high-performance solution. This price democratization will ultimately enable every business to benefit from the technology.

    The repatriation of cloud data

    Many companies realize that moving to the cloud is not as cost-effective, secure, or scalable as they initially thought. They’re now looking to return at least some of their core data and applications to their on-premises data centers.

    The truth is that data volumes in the cloud have become unwieldy. Organizations are discovering that storing data in the cloud is not only more expensive than they thought, but that it’s also hard to access that data expeditiously due to the cloud’s inherent latency.

    As a result, it can be more beneficial in terms of cost, security, and performance to move at least some company data back on-premises.

    Now that they realize the cloud is not a panacea, organizations are embracing the notion of cloud data repatriation. They’re increasingly deploying a hybrid infrastructure in which some data and applications remain in the cloud, while more critical data and applications come back home to an on-premises storage infrastructure.

    Immutable storage for businesses of all sizes

    Ransomware will continue to be a scourge to all companies. Because hackers have realized that data stored on network-attached storage devices is extremely valuable, their attacks will become more sophisticated and targeted. This is a serious problem because backup data is typically the last line of defense. Hackers are also attacking unstructured data. The reason is that if the primary and secondary (backup) data is encrypted, businesses will have to pay the ransom if they want their data back. This increases the likelihood that an organization, without a specific and immutable recovery plan in place, will pay a ransom to regain control over its data.

    It is not a question of if, but when, an organization will need to recover from a ‘successful’ ransomware attack. Therefore, it’s more important than ever to protect this data with immutable object storage and continuous data protection. Organizations should look for a storage solution that protects information continuously by taking snapshots as frequently as possible (e.g., every 90 seconds). That way, even when data is overwritten, the older objects remain as part of the snapshot, so there will always be another, immutable copy of the original objects that constitute the company’s data, and it can be recovered instantly, even if it’s hundreds of terabytes.
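
    The idea of frequent, immutable snapshots can be illustrated with a toy in-memory store. This is only a sketch of the concept, not how any storage product implements it: the class `SnapshotStore` is hypothetical, timestamps are plain seconds, and snapshots are assumed to be taken in chronological order.

```python
from types import MappingProxyType


class SnapshotStore:
    """Toy object store that keeps read-only, point-in-time snapshots."""

    def __init__(self):
        self._live = {}        # current, mutable objects
        self._snapshots = []   # (timestamp, read-only copy), oldest first

    def put(self, key, value):
        self._live[key] = value

    def snapshot(self, timestamp):
        # Copy the live state and wrap it so it cannot be modified afterwards.
        frozen = MappingProxyType(dict(self._live))
        self._snapshots.append((timestamp, frozen))

    def restore(self, timestamp):
        """Return the newest snapshot taken at or before `timestamp`."""
        candidates = [s for t, s in self._snapshots if t <= timestamp]
        if not candidates:
            raise LookupError("no snapshot at or before that time")
        return dict(candidates[-1])


store = SnapshotStore()
store.put("report.docx", b"original")
store.snapshot(0)
store.put("report.docx", b"ENCRYPTED")   # e.g. overwritten by ransomware
store.snapshot(90)
print(store.restore(89)["report.docx"])  # state just before the overwrite
```

    Because each snapshot is a separate read-only copy, overwriting the live object does not touch earlier versions, and the snapshot interval bounds how much recent work can be lost.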

    Green storage

    Global data centers consume massive amounts of energy, which contributes to global warming. Data centers now eat up around 3% of the world’s electricity supply. They are responsible for approximately two percent of global greenhouse gas emissions. These numbers put the carbon footprint of data centers on par with the entire airline industry.

    Many companies are seeking to reduce their carbon footprint and be good corporate citizens. As part of this effort, they are increasingly looking for more environmentally-friendly storage solutions, those that can deliver the highest levels of performance and capacity at the lowest possible power consumption.

    In 2020, organizations of all sizes will work hard to get the most from the data they create and store. By leveraging these five trends and adopting a modern approach to data storage, organizations can more effectively transform their business and thrive in the digital economy.

    The ‘Prevention Era’ will be overtaken by the ‘Recovery Era’

    Organizations will have to look to more efficient and different ways to protect unstructured and structured data. An essential element to being prepared in the ‘recovery era’ will involve moving unstructured data to immutable object storage with remote replication, which will eliminate the need for traditional backup. The nightly backup will become a thing of the past, replaced by snapshots every 90 seconds. This approach will free up crucial primary storage budget, VMware/Hyper-V storage, and CPU/memory for critical servers.

    While data protection remains crucial, in the data recovery era, the sooner organizations adopt a restore and recover mentality, the better they will be able to benefit from successful business continuity strategies in 2020 and beyond.

    Author: Sean Derrington

    Source: Dataversity

  • Understanding and taking advantage of smart data distancing

    Understanding and taking advantage of smart data distancing

    The ongoing COVID-19 pandemic has made the term 'social distancing' a cynosure of our daily conversations. There have been guidelines issued, media campaigns run on prime time, hashtags created, and memes shared to highlight how social distancing can save lives. When you have young children talking about it, you know the message has cut across the cacophony! This might give data scientists a clue of what they can do to garner enterprise attention towards the importance of better data management.

    While many enterprises kickstart their data management projects with much fanfare, egregious data quality practices can hamper the effectiveness of these projects, leading to disastrous results. In a 2016 research study, IBM estimated that bad quality data costs the U.S. economy around $3.1 trillion every year.

    And bad quality data affects the entire ecosystem; salespeople chase the wrong prospects, marketing campaigns do not reach the target segment, and delivery teams are busy cleaning up flawed projects. The good news is that it doesn’t have to be this way. The solution is 'smart data distancing'.

    What is smart data distancing?

    Smart data distancing is a crucial aspect of data management, and more specifically of data governance: it helps businesses identify, create, maintain, and authenticate data assets to ensure they are devoid of data corruption or mishandling.

    The recent pandemic has forced governments and health experts to issue explicit guidelines on basic health etiquette; washing hands, using hand sanitizer, keeping social distance, etc. At times, even the most rudimentary facts need to be recapped multiple times so that they become accepted practices.

    Enterprises, too, should strongly emphasize the need for their data assets to be accountable, accurate, and consistent to reap the true benefits of data governance.

    The 7 do’s and don’ts of smart data distancing:

    1. Establish clear guidelines based on global best data management practices for the internal or external data lifecycle process. When accompanied by a good metadata management solution, which includes data profiling, classification, management, and organizing diverse enterprise data, this can vastly improve target marketing campaigns, customer service, and even new product development.

    2. Set up quarantine units for regular data cleansing or data scrubbing, matching, and standardization for all inbound and outbound data.

    3. Build centralized data asset management to optimize, refresh, and overcome data duplication issues for overall accuracy and consistency of data quality.

    4. Create data integrity standards using stringent constraint and trigger techniques. These techniques will impose restrictions against accidental damage to your data.

    5. Create periodic training programs for all data stakeholders on the right practices to gather and handle data assets and the need to maintain data accuracy and consistency. A data-driven culture will ensure the who, what, when, and where of your organization’s data and help bring transparency in complex processes.

    6. Don’t focus only on existing data that is readily available but also focus on the process of creating or capturing new and useful data. Responsive businesses create a successful data-driven culture that encompasses people, process, as well as technology.

    7. Don’t take your customer for granted. Always choose ethical data partners.
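    Points 2 and 4 above can be made concrete with a small sketch. The following Python example uses the standard-library `sqlite3` module to show constraints and a trigger guarding a table, with rejected rows set aside for review. The `customers` table, its columns, the sample rows, and the `load` helper are all hypothetical; a real deployment would use its own schema and database.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with a constrained table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id      INTEGER PRIMARY KEY,
            email   TEXT NOT NULL UNIQUE CHECK (email LIKE '%_@_%'),
            country TEXT NOT NULL CHECK (length(country) = 2)
        );

        -- Trigger: refuse deletes so records cannot be removed by accident.
        CREATE TRIGGER customers_no_delete
        BEFORE DELETE ON customers
        BEGIN
            SELECT RAISE(ABORT, 'deleting customers is not allowed');
        END;
        """
    )
    return conn


def load(conn: sqlite3.Connection, rows: list) -> list:
    """Insert rows one by one; rows that violate a constraint are quarantined."""
    quarantined = []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO customers (email, country) VALUES (?, ?)", row
                )
        except sqlite3.IntegrityError as exc:
            quarantined.append((row, str(exc)))
    return quarantined


conn = build_db()
incoming = [
    ("ana@example.com", "NL"),
    ("ana@example.com", "NL"),  # duplicate email, violates UNIQUE
    ("not-an-email", "US"),     # no '@', violates the email CHECK
    ("bo@example.com", "USA"),  # three letters, violates the country CHECK
]
rejected = load(conn, incoming)
```

    Here only the first row is stored; the other three end up in `rejected` together with the database's error message, where they can be cleansed and resubmitted. Any attempt to delete from `customers` is aborted by the trigger.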

    How to navigate your way around third-party data

    The COVID-19 crisis has clearly highlighted how prevention is better than a cure. To this effect, the need to maintain safe and minimal human contact has been stressed immensely. The same logic applies to data: when enterprises rely on third-party data, the risks increase manifold. Enterprises cannot ensure that a third-party data partner/vendor follows proper data quality processes and procedures.

    The questions that should keep you up at night are:

    • Will my third-party data partner disclose their data assessment and audit processes?
    • What are the risks involved, and how can they be best assessed, addressed, mitigated, and monitored?
    • Does my data partner have an adequate security response plan in case of a data breach?
    • Will a vendor agreement suffice in protecting my business interests?
    • Can an enterprise hold a third-party vendor accountable for data quality and data integrity lapses?  

    Smart data distancing for managing third-party data

    The third-party data risk landscape is complex. If the third-party’s data integrity is compromised, your organization stands to lose vital business data. However, here are a few steps you can take to protect your business:

    • Create a thorough information-sharing policy for protection against data leakage.
    • Streamline data dictionaries and metadata repositories to formulate a single cohesive data management policy that furthers the organization’s objectives.
    • Maintain quality of enterprise metadata to ensure its consistency across all organizational units to increase its trust value.
    • Integrate the linkage between business goals and the enterprise information running across the organization with the help of a robust metadata management system.
    • Schedule periodic training programs that emphasize the value of data integrity and its role in decision-making.

    The functional importance of a data steward in the data management and governance framework is often overlooked. The hallmark of a good data governance framework lies in how well the role of the data steward has been etched and fashioned within an organization. The data steward (or custodian) determines the fitness levels of your data elements, establishes controls, and evaluates vulnerabilities, and remains on the frontline in managing any data breach. As a conduit between IT and end users, a data steward offers you a transparent overview of an organization’s critical data assets that can help you have nuanced conversations with your customers.

    Unlock the benefits of smart data distancing

    Smart and unadulterated data is instrumental to the success of data governance. However, many enterprises often are content to just meet the bare minimum standards of compliance and regulation and tend to overlook the priority it deserves. Smart data means cleaner, high-quality data, which in turn means sharper analytics that directly translates to better decisions for better outcomes.

    Gartner says corporate data is valued at 20-25% of the enterprise value. Organizations should learn to monetize and use it wisely. Organizations can reap the benefits of the historical and current data that has been amassed over the years by harnessing and linking them to new business initiatives and projects. Data governance based on smart enterprise data will offer you the strategic competence to gain a competitive edge and improve operational efficiency.


    It is an accepted fact that an enterprise with poor data management will suffer an impact on its bottom line. Not having a properly defined data management framework can create regulatory compliance issues and impact business revenue.

    Enterprises are beginning to see the value of data in driving better outcomes and hence are rushing their efforts in setting up robust data governance initiatives. There are a lot of technology solutions and platforms available. Towards this endeavor, the first step for an enterprise is to develop a mindset of being data-driven and being receptive to a transformative culture.

    The objective is to ensure that the enterprise data serves the cross-functional business initiatives with insightful information, and for that to happen, the data needs to be accurate, meaningful, and trustworthy. Setting out to be a successful data-driven enterprise can be a daunting objective with a long transformational journey. Take a step in the right direction today with smart data distancing!

    Author: Sowmya Kandregula

    Source: Dataversity

  • What exactly is a Data Fabric? Definitions and uses

    What exactly is a Data Fabric? Definitions and uses

    A Data Fabric “is a distributed Data Management platform whose objective is to combine various types of data storage, access, preparation, analytics, and security tools in a fully compliant manner to support seamless Data Management.” This concept has gained traction as technologies, such as the Internet of Things, need to have a consistent way of making data available to specific workloads or applications. It is key for retrieving data across multiple locations spanning the globe, since many companies use a variety of storage system configurations and cloud providers.

    Other definitions of a Data Fabric include:

    • “A solution to the phenomenon where datasets get so large that they become physically impossible to move.” (Will Ochandarena)
    • “A comprehensive way to integrate all an organization’s data into a single, scalable platform.” (MAPR)
    • “An enabler of frictionless access of data sharing in a distributed data environment.” (Gartner)
    • “An information network implemented on a grand scale across physical and virtual boundaries – focus on the data aspect of cloud computing as the unifying factor.” (Forbes)
    • A design allowing for “a single, consistent data management framework, allowing easier data access and sharing in a distributed environment” (TechRepublic)

    Businesses use a Data Fabric to:

    • Handle very large data sets across multiple locations more quickly
    • Make data more accessible
    • Optimize the entire data lifecycle to enable applications that require real-time analytics
    • Integrate data silos across an environment
    • Deliver a higher value from data assets
    • Allow machine learning and AI to work more efficiently

    Author: Michelle Knight

    Source: Dataversity

  • Why cloud solutions are the way to go when dealing with global data management

    Why cloud solutions are the way to go when dealing with global data management

    To manage geographically distributed data at scale worldwide, global organizations are turning to cloud and hybrid deployments.

    Enterprises that operate worldwide typically need to manage data both on the local level and globally across all geographies. Local business units and subsidiaries must address region-specific data standards, national regulations, accounting standards, unique customer requirements, and market drivers. At the same time, corporate headquarters must share data broadly and maintain a complete view of performance for the whole multinational enterprise.

    Furthermore, in many multinational firms, data is the business, as in worldwide e-commerce, travel services, logistics, and international finance. So it behooves each company to have state-of-the-art data management to remain innovative and competitive. These same organizations must also govern data locally and globally to comply with many legislated regulations, privacy policies, security measures, and data standards. Hence, global businesses are facing a long list of new business and technical requirements for modern data management in multinational markets.

    For maximum business value, how do you manage and govern data that resides on multiple premises, clouds, applications, and data platforms (literally) worldwide? Global data management based on cloud and hybrid deployments is how.

    Defining global data management in the cloud

    The distinguishing characteristic of global data management is its ever-broadening scope, which has numerous drivers and consequences:

    Multiple physical premises, each with unique IT systems and data assets. Multinational firms consist of geographically dispersed departments, business units, and subsidiaries that may integrate data with clients and partners. All these entities and their applications generate and use data with varying degrees of data sharing.

    Multiple clouds and cloud-based tools or platforms. In recent years, organizations of all sizes have aggressively modernized and extended their IT portfolios of operational applications. Although on-premises applications will be with us into the foreseeable future, organizations increasingly prefer cloud-based applications, licensed and deployed on the software-as-a-service (SaaS) model. Similarly, when organizations develop their own applications (which is the preferred approach with data-driven use cases, such as data warehousing and analytics), the trend is away from on-premises computing platforms in favor of cloud-based ones from Amazon, Google, Microsoft, and others. Hybrid IT and data management environments result from the mix of systems and data that exist both on premises and in the cloud.

    Extremely diverse data with equally diverse management requirements. Data in global organizations is certainly big, but it is also diverse in terms of its schema, latencies, containers, and domains. The leading driver of data diversity is the arrival of new data sources, including SaaS applications, social media, the Internet of Things (IoT), and recently digitized business functions such as the online supply chain and marketing channels. On the one hand, data is diversifying. On the other hand, global organizations are also diversifying the use cases that demand large volumes of integrated and repurposed data, ranging from advanced analytics to real-time business management.

    Multiple platforms and tools to address diverse global data requirements. Given the diversity of data that global organizations manage, it is impossible to optimize one platform (or a short list of platforms) to meet all data requirements. Diverse data needs diverse data platforms. This is one reason global firms are leaders in adopting new computing platforms (clouds, on-premises clusters) and new data platforms (cloud DBMSs, Hadoop, NoSQL).

    The point of global data management in the cloud

    The right data is captured, stored, processed, and presented in the right way. An eclectic portfolio of data platforms and tools (managing extremely diverse data in support of diverse use cases) can lead to highly complex deployments where multiple platforms must interoperate at scale with high performance. Users embrace the complexity and succeed with it because the eclectic portfolio gives them numerous options for capturing, storing, processing, and presenting data in ways that a smaller and simpler portfolio cannot satisfy.

    Depend on the cloud to achieve the key goals of global data management. For example, global data can scale via unlimited cloud storage, which is a key data requirement for multinational firms and other very large organizations with terabyte- and petabyte-scale data assets. Similarly, clouds are known to assure high performance via elastic resource management; adopting a uniform cloud infrastructure worldwide can help create consistent performance for most users and applications across geographies. In addition, global organizations tell TDWI that they consider the cloud a 'neutral Switzerland' that sets proper expectations for shared data assets and open access. This, in turn, fosters the intraenterprise and interenterprise communication and collaboration that global organizations require for daily operations and innovation.

    Cloud has general benefits that contribute to global data management. Regardless of how global your organization is, it can benefit from the low administrative costs of a cloud platform due to the minimal system integration, capacity planning, and performance tweaking required of cloud deployments. Similarly, a cloud platform alleviates the need for capital spending, so up-front investments are not an impediment to entry. Furthermore, most public cloud providers have an established track record for security, data protection, and high availability as well as support for microservices and managed services.

    Strive to thrive, not merely survive. Let’s not forget the obvious. Where data exists, it must be managed properly in the context of specific business processes. In other words, global organizations have little choice but to step up to the scale, speed, diversity, complexity, and sophistication of global data management. Likewise, cloud is an obvious and viable platform for achieving these demanding goals. Even so, global data management should not be about merely surviving global data. It should also be about thriving as a global organization by leveraging global data for innovative use cases in analytics, operations, compliance, and communications across organizational boundaries.

    Author: Philip Russom

    Source: TDWI

  • Why data management is key to retailers in times of the pandemic

    Why data management is key to retailers in times of the pandemic

    Jamie Kiser, COO and CCO at Talend, explains why retailers, striving to ensure they’re not missing out on future opportunities, must leverage one thing: data. By utilizing customer intelligence and better data management, retailers can collect supply chain data in real time and place better orders with suppliers based on customer intelligence.

    While major industries from tech to the public sector felt COVID’s pain points, not many felt them as acutely as retail. Retailers must now contend with everything from unreliable supply chains to a limit on the number of customers in-store at any given time, as consumer behavior shifted with social distancing guidelines and new needs.

    For example, e-commerce grew by 44% in 2020. As we begin recovering from the pandemic, retailers increasingly push to deliver their customers an in-store shopping experience as seamless as it is online. However, these new digital strategies rely on precise inventory management, which remains a pain point for many brick-and-mortar stores.

    For retailers to ensure they’re not missing out on future opportunities, they need to leverage one thing: data. By utilizing customer intelligence and better data management, retailers can collect supply chain data in real time and place better orders with suppliers based on customer intelligence. Data will help retailers fully integrate their supply chain, customer, and in-store data to ensure they’re creating an in-store experience that’s competitive with shopping online and other new shopping behaviors.

    Eyes on the supply (chain)

    The pandemic has revealed the fragility of the supply chain. With unprecedented unpredictability in what products stores will and will not have access to, and when, retailers need to integrate real-time data management into their online operations. Investing in supply chain data strategies enables retailers to adapt and adjust to sudden breakdowns from their suppliers.

    Some companies, like Wayfair and Dick’s Sporting Goods, leverage real-time inventory data to make their supply chain transparent to their customers, so they are always up to date on what is and isn’t on the shelves. Investing in data management tools to collect supply chain data in real time empowers retailers to create better customer experiences and saves stores the estimated billions in sales lost when customers discover their desired item is out of stock.

    However, supply chain erosion is not the only supply problem retailers will have to overcome. Some issues facing supply and inventory come from consumers, like last March’s run on toilet paper or the PS5 selling out before they even made it to shelves. Seemingly instantaneous changes in customer behavior can instantly impact what items retailers are prioritizing in orders from suppliers. But without understanding customer behavior, retailers can overcorrect and wind up with dead inventory on their hands.

    Customer analytics influence orders

    To avoid collecting dead inventory and ensure orders to suppliers are accurate to their customers’ desires, retailers need to integrate customer intelligence with their supply chain information systems. Combining this information empowers retailers to place orders from suppliers based on precise predictive models of customer behavior. This way, retailers will keep up with rapid consumer behavior changes and keep supply chains up to speed with the latest trends in brick and mortar shopping.

    For example, buy online, pick up in-store (BOPIS) shopping experiences have been a growing trend among retailers and consumers the past few years, and this trend shows no sign of slowing down. One survey found BOPIS sales grew by 208% in 2020. But BOPIS is not the only trend growing during the pandemic. It has also accelerated research online, purchase offline (ROPO). Finding success in both BOPIS and ROPO is entirely contingent on understanding what’s on the shelf, what suppliers bring in, and which items are unpopular and creating dead inventory. By collecting specific customer intelligence, such as the products customers are researching, retailers can build predictive models for when online research turns into an in-store sale.

    Leveraging customer intelligence not only helps brick and mortar retailers keep shelves stocked with the products their customers wish to purchase but can also be integrated with supply chain data to optimize operations. Investment in data management and integration can positively impact retailers’ profits by allowing them to make purchasing decisions from suppliers based on supplier circumstances and customer demand. Pooling data from both suppliers and customers into a single source of truth gives retailers the ability to operate under intelligent predictive business models. It also helps prevent direct profit loss to competitors. Research shows an estimated $36.3 billion is lost annually to brick-and-mortar competitors when customers purchase elsewhere upon discovering their desired item is out of stock.

    An integrated approach to supply chain management

    Integrating real-time supply chain data with customer intelligence can prevent customer walkouts and increase profits by mitigating the risks to the supply chain created by the pandemic. Moreover, when this external data combines with internal data, like sales, restocking times, demand surges, and more, brick and mortar retailers can position their supply chain to be in sync with the real-time shopping occurring online and in-store. Better business intelligence and supply chain data management empower retailers to offer customers an experience competitive with the one they find online. Doing this requires a robust data management system and a business-wide data strategy that integrates data across all verticals.

    For example, European clothing retailer Tape à l’oeil replaced an aging ERP system with a new SAP and Snowflake-based infrastructure to better capture digital traffic data as they made operations digital due to the pandemic. This addition to their existing platform allows Tape à l’oeil to gather customer feedback through surveys that gauge satisfaction with a new collection. Digital campaign results are now easily retrieved from Facebook and Instagram, cross-referenced in Snowflake, and shared with management in comprehensive reports, making data the heart of the company’s business strategy.

    This new data strategy has allowed Tape à l’oeil to find success during these tumultuous times by integrating customer data into predictive models to help them act faster and mitigate risks in their supply chain. Tape à l’oeil’s CIO said that leveraging data has allowed them to improve operations overall and has given them the “agility” to react swiftly to disruptions in the supply chain.

    The brick and mortar way forward

    A year into the pandemic, the retail industry remains in an ever-precarious state. However, consumer trends show there are still growth opportunities for the brick-and-mortar stores prepared to meet customer demand.

    Making data management an integral part of retail operations will help companies meet the supply chain challenges presented by COVID-19 and empower them to keep their business growing.

    Author: Jamie Kiser

    Source: Talend

  • Will the battle on data between Business and IT be ended?


    Business users have growing customer expectations, changing market dynamics, increasing competition, and evolving regulatory conditions to deal with. These factors compound the pressure on business decision makers to act now. Unfortunately, they often can’t get the data they need when they need it.

    Research shows that business managers often have to make data-driven decisions within one day. However, building a single report using traditional BI methods can take six weeks or longer, and a typical business intelligence deployment can take up to 18 months.

    On the IT side, teams are feeling the pressure. They have a long list of items to do for the short run and long run. Regarding data management, IT has to try to combine data from multiple sources, ensure that data is secure and accurate, and deliver the data to the business user as requested.

    Given the need for “data now,” in relation to the bandwidth concerns placed on IT, many organizations find that their enterprise lacks the skills, technology, and support to use their corporate data to keep up with competitors, customer needs, and the marketplace.

    Adding to this existing challenge is the notion that companies are continuously adding new data sources, but each new data integration can take weeks or even months. By the time the work is complete, it’s likely that a newer, better source has already taken its place.

    Automation is a force that is driving change throughout the entire BI stack. Just look at the proliferation of self-service data visualization tools. But self-service analytics can quickly go awry without adequate governance.

    Companies that can integrate self-service BI and still maintain governance, security, and data quality will empower business users to make decisions on-demand, while relieving IT from these internal stakeholder pressures.

    Having the ability to store data in a place or a hub where it can be cleansed, reconciled, and made available as a consistent, on-demand resource to business users can help solve the issue.

    When quality issues arise, or bad data is found, the error can be corrected once in the hub for all users – resulting in one single source of the truth. It is a place where data quality and consistency are maintained. This central repository enables the right person to have access to the right data at the right time.

    Business executives, managers, and frontline users in operations want the power to move beyond the limits of spreadsheets so that they can engage in deeper analysis by leveraging data insights to strengthen all types of decision needs. Today, newer tools and methods are making it possible for organizations to meet the demands of nontechnical users by enabling them to access, integrate, transform, and visualize data without traditional IT handholding.

    The age of self-service demands that business users have full and flexible access to their data. It also demands that business users be the ones who determine what data should be included in the system. And while business users need the expert help of IT to ensure the quality, consistency, and contextual validity of the data, business and IT can now work together more closely and more easily than ever before.

    Organizations can effectively “democratize” data by addressing the needs of nontechnical users including business executives, managers, and frontline users. This can happen if they grant more power to those users, not just in terms of access and discovery, but also in terms of sourcing what goes into a central hub.

    In the end, giving more power to the people is one surefire way to help end the battle between business and IT.

    Author: Heine Krog Iversen

    Source: Information Management
