91 items tagged "data"

  • ‘World Situation Room’ visualizes the global impact of the coronavirus

    Toucan Toco and CashStory have combined their expertise to make data about the corona crisis easier to read and accessible to everyone through data visualization. The World Situation Room application presents accurate and relevant information about the impact of the Covid-19 virus on global health and the economy in an interactive dashboard.

    World Situation Room visualizes updates on the worldwide spread of the Covid-19 pandemic, including the number of confirmed infections and the number of patients declared recovered. Displaying these figures in an interactive dashboard makes trends visible, along with a ranking of the countries where infections have been confirmed.

    The dashboard also shows information about the global economic impact of the corona crisis, with data on stocks, commodities, currencies, and general economic indicators.

    Users of the interactive dashboard can view the information at a global or national level as desired.

    The application is a beta version; work is underway to add interest rates, logistics and supply chain updates, and an overview of the financial concessions granted by governments to support trade and industry, among other things.

    “In a time of unprecedented uncertainty and volatility, our mission is to offer a single source of accurate and up-to-date information on health and economic developments,” says Baptiste Jourdan, founder of Toucan Toco.

    Source: Toucan Toco

  • 4 Tips to help maximize the value of your data

    Summer’s lease hath all too short a date.

    It always seems to pass by in the blink of an eye, and this year was no exception. Though I am excited for cooler temperatures and the prismatic colors of New England in the fall, I am sorry to see summer come to an end. The end of summer also means that kids are back in school, reunited with their friends and equipped with a bevy of new supplies for the new year. Our kids have the tools and supplies they need for success; why shouldn’t your business?

    This month’s Insights Beat focuses on additions to our team’s ever-growing body of research on new and emerging data and analytics technologies that help companies maximize the value of their data.

    Get real about real time

    Noel Yuhanna and Mike Gualtieri published a Now Tech article on translytical data platforms. Since we first introduced the term a few years ago, translytical data platforms have been a scorching hot topic in database technology. Enabling real-time insights is imperative in the age of the customer, and there are a number of vendors who can help you streamline your data management. Check out their new report for an overview of 18 key firms operating in this space, and look for a soon-to-be-published Forrester Wave™ evaluation as well.

    Don’t turn a blind eye to computer vision

    Interested in uncovering data insights from visual assets? Look no further than computer vision. While this technology has existed in one form or another for many years, development in convolutional neural networks reinvigorated computer vision R&D (and indeed established computer vision as the pseudo-progenitor of many exciting new AI technologies). Don’t turn a blind eye to computer vision just because you think it doesn’t apply to your business. Computer vision already has a proven track record for a wide variety of use cases. Kjell Carlsson published a New Tech report to help companies parse a diverse landscape of vendors and realize their (computer) vision.

    Humanize B2B with AI

    AI now touches on virtually all aspects of business. As AI techniques grow more and more sophisticated, so too do their use cases. Allison Snow explains how B2B insights pros can leverage emerging AI technologies to drive empathy, engagement, and emotion. Check out the full trilogy of reports and overview now.

    Drive data literacy with data leadership

    Of course, disruptive changes to data strategy can be a hard sell, especially when your organization lacks the structural forces to advocate for new ideas. Jennifer Belissent, in a recent blog, makes the case for why data leadership is crucial to driving better data literacy. Stay tuned for her full report on data literacy, coming soon.

    More than just leadership, data and analytics initiatives require investment, commitment, and an acceptance of disruption. No initiative will be perfect from the get-go, and it’s important to remember that analytics initiatives don’t usually come with a magician’s reveal.

    Author: Srividya Sridharan

    Source: Forrester

  • 5 Guidelines that keep your business' analytics app working optimally

    One of the key challenges faced by organizations deploying an enterprise-wide analytics solution is the maintenance and upgrade of its applications. Most organizations follow an agile development methodology that entails frequent releases with new content, as well as routine upgrades, patches, fixes, security updates, etc.

    Depending on the complexity of the application, you need to invest a significant amount of time, energy, and manpower to ensure that none of the existing reports, dashboards, or underlying data is adversely impacted by any of these maintenance tasks. Any degradation in the performance or accuracy of these applications not only reflects poorly on the system administrators; it also erodes trust in the analytics solution and ultimately hurts user adoption and business value throughout the organization.

    Hence, it is critical for system administrators to ensure that the application and the data within it remain consistent and reliable for end users, irrespective of the ongoing maintenance tasks they have to perform on the system.

    A typical testing methodology adopted by most organizations involves manual testing and 'eye-balling' of a subset of reports and data after major maintenance tasks such as patches and updates. Organizations with more resources may create custom test scripts and automate certain parts of the testing and QA process.

    Upgrades are typically more involved and take a lot more time and testing to ensure consistency. When your analytics application grows to thousands of users and tens of thousands of reports and dashboards, it is usually cost prohibitive to test every single report for every user. Hence, automation of this testing process is critical to the long-term success of an analytics application.

    Here are five things to keep in mind when automating testing of analytics applications:

    1. Real-world applications

    Make sure that tests are run on as many real-world production applications as possible. Testing on just one or a handful of sample environments is not ideal and can lead to unforeseen issues when deploying an update or upgrade. The applications on which tests are run need to be representative of the real-world applications that your users or customers will be using.

    2. Replica of live production system

    Ensure that there is no impact to the actual live production system at the time of testing. To run a series of tests at any time of the day, you need a replica of the production system with the same hardware and software, but in an isolated environment that is as similar to the production system as possible. This way, as your users report new issues, you can analyze them and assess their impact by running tests in a separate environment so system performance for users is not affected by the ongoing testing. Using a cloud platform makes it easier to quickly provision a replicated environment for testing purposes.

    3. Platform approach to testing

    It is really important to design the automated testing system as a platform for running a variety of tests, rather than creating disjointed automation scripts for different scenarios. The testing process also needs to incorporate feedback: when it fails to identify an issue, the tests should be updated so that issue is caught the next time. With a single platform, you can achieve economies of scale and optimize and share the testing infrastructure across multiple scenarios and applications.

    4. Granularity of test execution data

    Test results should not be simply binary in terms of pass or fail. Irrespective of whether an application passes or fails a particular test, it is important to capture detailed statistics and information from every step of the testing process. This will help you identify and anticipate future issues and fine tune the testing process.
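To make this concrete, here is a minimal Python sketch of what capturing per-step detail (rather than a single pass/fail flag) might look like; the step names and captured fields are illustrative, not taken from any particular testing tool:

```python
import time

def run_step(name, fn):
    """Run one test step and capture detailed stats, not just pass/fail."""
    start = time.perf_counter()
    try:
        result = fn()
        status = "pass"
        error = None
    except Exception as exc:
        result = None
        status = "fail"
        error = str(exc)
    return {
        "step": name,
        "status": status,
        "duration_s": round(time.perf_counter() - start, 4),
        "result": result,
        "error": error,
    }

# Example: two steps of a report test, one passing, one deliberately failing
records = [
    run_step("render_report", lambda: "ok"),
    run_step("check_row_count", lambda: 1 / 0),  # simulated failure
]
```

Each record can then be loaded into a results store for trend analysis, instead of being reduced to a single boolean.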

    5. Analysis of test results 

    Depending on the complexity of the testing process, the analysis of test results can be a full-fledged analytics application in itself. The test results should be stored in an optimized format (for example, in a data warehouse) that makes it easy to analyze in detail to gain further insights into the application performance. This will also help analyze historical system test results and monitor the performance over a period of time.


    With the ever-increasing importance of analytics and the use of mobile devices at an all-time high, an optimally functioning analytics app is valuable for any business. These apps should be unaffected by necessary processes like updates, testing, and maintenance in order to keep working optimally.

    That's why it's crucial that your business always keeps the guidelines mentioned above in mind. Keep improving your applications, especially the ones connected to your business' analytics solution, but never let these improvements affect the use of the app negatively!

    Author: Pradyut Bafna

    Source: MicroStrategy

  • 5 mistakes to watch out for when using data analytics in marketing

    If marketing were an apple pie, data would be the apples — without data supporting your marketing program, it might look good from the outside, but inside it’s hollow. In a recent survey from Villanova University, 100% of marketers said data analytics has an essential role in marketing’s future. With everyone on board with the importance of data analytics, it’s surprising that as of 2020, only 52.7% of marketers were actually using analytics in their marketing efforts (according to Marketing Evolution), and only 9% of marketers polled by Gartner’s Marketing Data and Analytics Survey said their company has a strong understanding of how to effectively use data analytics. 

    This illuminates a disconnect: Marketers understand data’s significance, but they don’t know how to use it to best serve their business objectives. Becoming a data-minded marketer is a process, and the stats clearly show a large number of marketers are still engaged in that process. With a little guidance, you can avoid some common marketing analytics mistakes to make your data journey smoother. 

    Mistake #1: Bringing data into marketing too late

    That feeling of figuring out the perfect anecdote for an ad, tagline for a brand, or video concept for a new product launch is what keeps so many marketers going. That’s why it’s understandable that having to nix a brilliant idea because it doesn’t align with your data is painful — it’s also often necessary; you must learn to kill your darlings. This is a place where data can help marketers make smarter decisions, versus going with gut instinct.

    When you choose to ignore the directions data dictates, you could be choosing to ignore the main objective of marketing: to connect with your audience and inspire users to take action. 

    “There’s this tendency to build an idea and assume it’s going to be effective, but what marketers need to do is take a step back and let data inform development,” says Deven Wisner, Managing Partner of Viable Insights, a data evaluation firm that specializes in helping businesses build stronger data foundations to improve organizational success. “We need to be willing to reflect along the way, pivot when needed, and be agile — data allows us to do this, but we need to build the mentality of leading with data.” 

    This mindset is still something many marketers need to adopt. In a recent Gartner survey, 32% of marketers admitted to ignoring data that conflicted with their intended course of action. However, as Deven states, avoiding data insights and going with your gut is like choosing all the wrong answers on a test despite your professor giving you the right ones. Data can’t help your marketing efforts if you won’t let it. 

    Bonus tip from the mind of a marketer: 

    “Data and creative should live together,” says Partnership Director of Kicks Digital Marketing Brooke Heffernan. “When you discover data that means something, you need to be agile enough to make experimental changes.” 

    However, Brooke cautions marketers to still put creative efforts at the forefront of business objectives and have them supported by data, not controlled by it. 

    “Data doesn’t need to define creative; however it is so insightful to target where you market and give you a deep understanding of your audience and how they behave,” Brooke continues. “If you know who your audience is, where they are, and what they care about, you’ve solved half your equation. Data makes a big difference here.”

    Key takeaways: 

    1. Pull data before beginning a major marketing project 
    2. Use your findings to define your key audiences and their behaviors 
    3. Use data to build creative ideas to reach your audience 

    Mistake #2: Choosing the wrong data visualization to present your data

    Data visualizations are graphic representations of data. Instead of seeing rows upon rows of data and trying to discern a meaning, visualizations display essential information in the form of charts, graphs, line plots, scatter plots, word clouds, infographics — anything that visually tells the story of your data. 

    These visuals are often used to build dashboards, which allow users to view crucial data all in one place, and they are customizable based on each user and their needs. The trouble is, marketers aren’t necessarily data experts, and without background knowledge, choosing the right data visualizations to represent your data can be overwhelming. 

    “A lot of people want to pick the sexiest or coolest visual, but that’s where we end up completely misrepresenting our data,” Deven says. “Sometimes we pick the chart before we know our data. We must first know the data to find the visualization that fits.” 

    By diving into your data before you have your heart set on a visual, you can effectively determine which graphic will tell your data’s story best. For example, if you are looking to discover your most profitable months in the last year, a scatter plot would likely be a confusing choice. A bar chart organized from lowest to highest profitability would be a better option as it would allow users to see insights with ease. 
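As a small illustration of “know the data first, then pick the chart,” the sorting step behind that lowest-to-highest bar chart can be sketched in plain Python (the monthly figures here are invented); the resulting labels and heights map directly onto the bars in any charting tool:

```python
# Hypothetical monthly profit figures (invented for illustration)
monthly_profit = {"Jan": 12_000, "Feb": 9_500, "Mar": 15_200, "Apr": 8_100}

# Sort categories by value so the bars read from lowest to highest profitability
bars = sorted(monthly_profit.items(), key=lambda kv: kv[1])
labels = [month for month, _ in bars]     # x-axis categories, in display order
heights = [profit for _, profit in bars]  # corresponding bar heights
```

The same data fed to a scatter plot would bury the ranking; ordering the categories by value is what makes the insight legible at a glance.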

    Work with your org’s in-house data experts to build analytics into your workflows that will make sense to all your users and drive meaningful decisions.

    Mistake #3: Making vanity metrics your main event

    Every marketer knows that your energy, creativity, and sometimes sanity go into making captivating work. There’s nothing better than learning a video went viral or a social campaign generated thousands of likes, follows, and comments … or is there?

    It’s easier now than ever to lose sight of what’s truly important in marketing: generating leads that turn into conversions, turning conversions into loyal customers, and evolving loyal customers into brand advocates. When the primary focus becomes likes, retweets, follows, and comments — also known as vanity metrics — marketing efforts become less meaningful for your long-term goals. 

    Vanity metrics ebb and flow like a tidal wave, meaning they can easily consume your marketing efforts. It’s crucial to balance the data from these metrics with that of engagement metrics (the data that ignites customer action). Think of vanity metrics as a supporting layer that is peeled back to reveal core metrics, such as brand interaction, lead generation, and conversions. 

    “Vanity metrics aren’t useless. They can be important for brand awareness, but they don’t necessarily equate to sales,” Brooke says. “Vanity metrics should be your keyboardist, not your lead vocalist.” 

    Mistake #4: Relying too heavily on data

    Record scratch — yes, there is such a thing as putting too much emphasis on your data and not enough on your people. Marketing is a human-centered industry often fueled by emotion and grown by inspiration. Taking people out of the decision-making equation is a fast way to lose your impact and actually dilute your data. 

    “Too often we expect data to give us all the answers, but in reality, humans are a really important part of decision-making,” Deven says. “Without people, data is just information floating around.” 

    When presented with data, it’s you and your team who will make impactful work out of rows of information and visualizations. Data doesn’t have the ability to build copy, well-designed campaigns, ads, and videos — your people do. When we use data as a fundamental part of building meaningful work, that’s where we find value.

    “Your people, your process, your performance always matter,” Brooke says. “Data is a piece of the puzzle and a valuable tool if it’s used in order of priority within marketing, but you can’t forget the human factor.” 

    Mistake #5: Failing to build a data-focused culture

    “You don’t know what you don’t know — not everyone is a data expert,” Deven says. But not being a data analytics guru doesn’t mean there isn’t room to become a novice, intermediate, and eventually knowledgeable resource. 

    “A marketer may not be an expert at data analytics, but what you need to think about is, How can you become even 10% better at it?” Deven says. “The answer is you need to build a data-focused culture.” 

    This doesn’t happen overnight or even within a few months. True data culture must involve continued education, efforts, and engagement from marketing leadership and senior management all the way through to junior employees. If everyone accepts and embraces data analytics as a cornerstone of marketing, that is when culture begins to emerge. 

    This process can be difficult, especially when the topic of data has become as trendy as customized face masks and as popular as Marvel movies. Like anything in marketing, doing something for the sake of doing it will never lead to success. This mentality needs to be implemented when building data culture as well. Don’t just do it because data is a buzzword; do it for the extended health of your marketing efforts and business. 

    “People collect data all the time without knowing how to use it,” Deven says. “It’s essential to shift the focus away from data being this cool thing and refocus on how we can embed data into organizations for a true organizational shift.” 

    As a digital marketer, Brooke describes the process of building a data culture as self-awareness for your agency. 

    “You can learn to love your body while it’s still a work in progress, and that’s how it should be when creating a data-minded marketing culture — you need to appreciate where you are and what you’ve learned, and create realistic next steps to achieve your goals.” 

    No one has to become a bona fide data analyst for this culture to thrive, but there does need to be a foundation of basic principles that marketers must appreciate so data can become a priority. Once this basis for analytics appreciation has been seeded, infusing analytics into workflows will take your organization the rest of the way to becoming truly data-driven, putting the right piece of actionable intelligence in front of the right person at the right time, in the right way for them to make the best decision possible. 

    “There are some really powerful insights out there,” Brooke says. “And it’s never a bad time to start becoming aware.” 

    Author: Madi Szrom

    Source: Sisense

  • 5 requirements for modern financial reporting  

    How much time does your finance team spend collecting, sifting, and analyzing data?

    If you said “too much time,” you’re right. According to a Deloitte report, finance teams spend 48% of their time creating and updating reports. And when they’re operating at such a tactical level, it can be hard for them to see the forest for the trees.

    Without a modern approach to financial reporting, finance teams are so bogged down in the details that they simply don’t have the time to uncover insights in the data: insights that could be vital to your business.

    So how do you help them? In this piece, we’ll highlight five things you need to strengthen your financial reporting and be strategic in the data decade, and how you can get them.

    1. Accountability and dynamic reports

    Finance teams have a lot riding on their shoulders. They’re responsible for reporting on business performance, something leadership teams and customers care deeply about. But business stakeholders don’t just want to be told what they want to hear. They want to know what’s really going on at the company.

    What’s happening in sales, product, marketing, and customer success, and how does each group’s progress (or lack thereof) contribute to the whole? How could these groups optimize to get the most return? Extended Planning and Analysis (xP&A), or the concept of breaking down silos and reporting across the organization, is what the future holds. But the finance department needs to change today.

    Finance teams need to be able to detect and help mitigate risk in all areas of the business. But in this day and age, there is so much noise that it’s hard to know whether inconsistencies are simply a result of bad data or if they truly represent an underlying issue that the company needs to fix. Worse, many of the reports finance teams run are in spreadsheets, which are prone to error and only show what's happening at a single point in time. 

    To hold themselves and their business partners accountable, finance teams need accurate, useful financial reporting: they need dynamic reports. BI platforms enable finance teams’ accountability by monitoring performance, identifying trends, and determining profitability at any given moment.

    2. Transparency in business intelligence

    What was one of the most important things you learned back in high school math? Showing your work. It’s no different for finance teams; they just have to show their work on a much broader, higher-stakes scale. 

    Proving that they collected the right data, used the right transformations, and performed the right analysis is finance table stakes for a company of any size. But because data is constantly growing and changing, even the basics are becoming difficult to substantiate, and will only become more difficult over time. To provide the transparency that internal and external stakeholders desire, companies need to bring their data under control. 

    Modern cloud-based solutions can integrate directly with ERPs and other accounting systems to make it abundantly clear where financial information is coming from. And the financial dashboards, budgeting tools, and forecast modeling that result show exactly what that data means for the company.

    3. Trustworthy KPIs

    It’s one thing to have a lot of data, but it’s another to actually trust those numbers. Unfortunately, most businesses, even (and perhaps especially) small ones, house their data in disparate databases, a recipe for fragmented, duplicative, and inaccurate analysis. When companies operate in this fashion, it’s no wonder stakeholders have trouble trusting their insights.

    What organizations really need is a purpose-built financial planning and reporting solution to funnel data residing in various systems into one place where it is deduped, transformed, and otherwise made ready for analysis. With a standardized, trustworthy source of truth, everyone can work under the same assumptions and draw more accurate conclusions. A single source of truth also makes your KPIs a truer reflection of where your business stands at all times.
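At its simplest, that consolidation step is a dedupe on a shared key with a rule for which system wins. The toy Python sketch below (the records, systems, and priority rule are all invented) shows the idea; a real planning solution does this at scale and layers transformations on top:

```python
# Hypothetical customer records pulled from two disparate systems
crm_rows = [
    {"email": "a@example.com", "revenue": 1200, "source": "crm"},
    {"email": "b@example.com", "revenue": 300,  "source": "crm"},
]
billing_rows = [
    {"email": "a@example.com", "revenue": 1250, "source": "billing"},  # duplicate of the CRM record
    {"email": "c@example.com", "revenue": 90,   "source": "billing"},
]

# Dedupe on a chosen key, preferring a designated system of record
PRIORITY = {"billing": 0, "crm": 1}  # lower number wins

merged = {}
for row in sorted(crm_rows + billing_rows, key=lambda r: PRIORITY[r["source"]]):
    merged.setdefault(row["email"], row)  # first (highest-priority) record wins

single_source = list(merged.values())
```

With duplicates resolved by an explicit rule, every downstream report works from the same figures instead of whichever system it happened to query.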

    4. Self-service reporting

    Your finance team is probably spending their days gathering all the information they need to create and run reports, leaving them very little time to focus on strategy. In fact, McKinsey finds that finance leaders only spend 19% more time on value-add activities than other organizations, but that’s more than anyone else in their department. So how can you enable FP&A teams to actually focus on the planning and analysis?

    The answer lies in self-service reporting. Many companies rely on IT to run reports, but that can take a long time and the reports are stagnant. But what if anyone could pull their own reports? They’d get the data they need without having to wait. And everyone would have more time to surface important insights and help the company be more strategic. Self-service financial reporting software evangelizes data analysis throughout an organization, making the whole company more data-driven, productive, and effective.

    5. Data exploration with self-service reporting

    For your finance team to get the most out of your data, they need to break out of siloed frameworks and change their perspectives. That means they need to step away from the same models they’ve been leveraging over and over.

    Thinking outside the box and collaborating with other teams can reveal nuggets of wisdom that otherwise would’ve been overlooked. Financial analysis platforms can help your teams slice and dice your data and visualize it in different ways, opening the door to more creative exploration and interpretation. And when those insights are readily available, finance can share them with other teams to create and sustain a competitive edge.

    Source: Phocas Software

  • 6 Questions to ask yourself when conducting market research

    As market researchers, our goals and responsibilities are to deliver thoughtful, accurate, data-driven insights to our partners.

    To do this successfully, we should always ask ourselves these questions:

    1. Is this methodology right for the audience, topic, and objectives? 

    Sometimes the objectives drive the methodology, but sensitivity to the topic and the needs of the respondents should also be kept in mind. Many times, a mixed methodology (such as qualitative with a homework assignment like a diary or a video log) or a multiphase approach (with qual before or after quant, or a stakeholder workshop prior to the research) is what will work best.

    2. Am I being too narrow or making assumptions about my target population? 

    Do not limit the scope of the research based on assumptions, old data, or old processes. Surveying your current (perceived) target or biggest buying group doesn’t help to expand your market. Including nonusers in your sample can provide insights into the product’s limitations that you might have otherwise overlooked.

    3. Am I screening and including a representative set of respondents? 

    Ensuring that you are surveying the correct group is important. If the target population is unknown, use the initial screening completes to identify the target population at its population-appropriate levels, as well as to set and adjust quotas for your final data set. If you or your partner are not sure whether more males than females use the product, loosely set quotas to allow males and females to fall out naturally when sending out a nationally representative sample. This allows you to identify what the product category demographics look like out of a balanced, nationally representative sample. (Read more about survey sampling plan designs.)

    4. Is the questionnaire composed of clear, well-framed questions? 

    Overall, questions should be simple to read and understand. They should be concise enough to avoid confusion but with enough context to disallow multiple interpretations. Some questions may require a timeframe or other level-setting qualifier. Industry terms and acronyms should generally be avoided or else thoroughly defined. And be sure to ask a variety of question types – key metrics, diagnostics, and open-ended feedback (an open-end is often most helpful and important after a key metric, such as asking why the respondent would or would not be likely to buy).

    5. Is the research instrument as short and engaging as possible? 

    Research on research shows us that keeping surveys short and smoothly flowing will not only lead to better, cleaner data, but will also help to maintain a positive relationship with respondents. While it seems harmless to add an extra question or two at the end of a 15- or 20-minute survey, stretching the survey length beyond 25 minutes fatigues respondents, possibly causing them to provide invalid or less thoughtful responses. (For instance, open-ended questions towards the beginning of the survey tend to garner longer and more interesting responses.) Asking the most important questions early is helpful, as is placing demanding question types like grids early, since they are harder for a fatigued respondent to answer at the 20-minute mark. This is why easy questions like demographics are often last in a survey. Lastly, it is helpful to use a variety of question types instead of page after page of the same thing, especially when those are grids! (Read more about how you can maximize survey results with no grid questions.)

    6. Is the data I am collecting clean and reliable? 

    Every data set should be reviewed for outliers, bots, fraudulent respondents, speeders, and cheaters before the end of data collection (so those found can be replaced with valid respondents). Reviewing survey data and removing bad respondents (including those providing poor or nonresponsive open-ended answers) ensures that the insights are reliable and valid. (Read more about sampling methodology and the sampling industry landscape.)
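As a minimal sketch of such a cleaning pass, the Python below flags speeders and nonresponsive open ends; the records, the one-third-of-median speeder rule, and the junk-answer list are all assumptions for illustration, not an industry standard:

```python
# Hypothetical respondent records: id, completion time in seconds, open-end answer
respondents = [
    {"id": 1, "seconds": 540, "open_end": "I liked the packaging and the price."},
    {"id": 2, "seconds": 45,  "open_end": "good"},       # speeder
    {"id": 3, "seconds": 610, "open_end": "asdf asdf"},  # nonresponsive open end
    {"id": 4, "seconds": 480, "open_end": "Too expensive for what it offers."},
]

MEDIAN_SECONDS = 520                 # assumed median completion time for this survey
SPEEDER_CUTOFF = MEDIAN_SECONDS / 3  # assumed rule of thumb: under 1/3 of the median
JUNK_ANSWERS = {"asdf asdf", "n/a", "none"}

def is_valid(r):
    if r["seconds"] < SPEEDER_CUTOFF:
        return False  # too fast to have read the questions
    if r["open_end"].lower().strip() in JUNK_ANSWERS:
        return False  # nonresponsive open-ended answer
    return True

clean = [r for r in respondents if is_valid(r)]
```

Running checks like these during fielding, not after, leaves time to replace the removed respondents with valid completes.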

    With years of experience, it can be easy to become entrenched in our processes and begin to overlook the many details that ensure a successful market research project. This list serves to remind us all to ask the important questions, ensuring that every research initiative is efficient and effective in yielding reliable, high-quality insights to drive important business decisions.

    Authors: Sara Sutton & Stephanie Trevino

    Source: Decision Analyst

  • 6 Ways to improve the intelligence of your business

    Business Intelligence (BI) once was a luxury used by corporations and enterprises who invested in a team of data scientists and IT specialists. Modern technology and software tools have made it possible for anyone to increase their BI value within their organization. Small and medium-sized businesses can use the same tools without investing a lot of time or money. Here are some ways to increase the value of your business intelligence.

    Spread it across your organization

    Where is Business Intelligence needed in your organization? Which part of your organization should you focus on? Is it something you should use for a single department or as a company-wide tool? These are common questions for businesses of all sizes.

    In many companies, BI is used mostly by executives; in others, it is primarily a sales department tool. Companies that confine BI this way aren’t getting the most out of their BI efforts. Business Intelligence is effective in any department within your organization if you allow it to be. The organizations that get the most out of their BI investments use it across their entire organization.

    Make it proactive

    Business Intelligence is, at its core, a form of data visualization: it helps you understand data and make better business decisions. It can become proactive once you understand your business’s triggers. Automated alerts can then inform you when unusual changes are taking place in your data.

    If you want to increase the value of your BI, then set up automated alerts. They can come in the form of e-mail or SMS alerts when your data hits a certain threshold for example. This turns your BI from a reactive tool into a proactive tool. Automated alerts help you address issues as they arise.
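
    As a rough sketch of the idea (the metric names and thresholds here are invented), an automated alert can be as simple as a rule table checked against the latest numbers; in production, the resulting messages would feed an e-mail or SMS gateway:

```python
def check_thresholds(metrics, rules):
    """Return alert messages for metrics that cross their configured thresholds.

    rules: {metric_name: (direction, limit)} where direction is 'above'/'below'.
    """
    alerts = []
    for name, (direction, limit) in rules.items():
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported this period
        if direction == "above" and value > limit:
            alerts.append(f"{name} is {value}, above the {limit} threshold")
        elif direction == "below" and value < limit:
            alerts.append(f"{name} is {value}, below the {limit} threshold")
    return alerts

today = {"daily_signups": 42, "error_rate": 0.07}
rules = {"daily_signups": ("below", 100), "error_rate": ("above", 0.05)}
print(check_thresholds(today, rules))
```

    A scheduler would run a check like this after each data refresh, so issues surface as they arise rather than at the next reporting cycle.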

    Incorporate self-service options

    When it comes to traditional BI, users request the reports they need. Then they wait for their IT team to create those reports. Depending on the efficiency of the IT team, this entire process could take several weeks. This process can be frustrating for both the IT group and its users.

    IT’s workload is endless given the number of requests they receive in a day, which leads to slow turnaround times and frustrated users. To increase the value of your BI investment, eliminate the old reporting process and give users access to self-service tools, so they can access reports immediately. This removes the need to bother the IT department, which can then focus on more important things.

    Automate everything

    Business Intelligence depends on the data that supports it. BI built on outdated or inaccurate data is worse than no BI at all. Many organizations don’t realize that they can’t simply point BI software at their existing data and create reports right away. The data must first be consolidated in one place and formatted specifically for the BI tool being used.

    Many businesses turn to manual processes to meet this need. Not only does this waste time, it also leads to human error: about 90% of spreadsheets contain data errors. To get more from your BI investment, automate your data pipeline with a data warehouse and an ETL tool. This can significantly reduce errors and save time processing data.
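
    To illustrate the point with a toy example (the sources, table, and figures are made up, and Python’s built-in sqlite3 stands in for a real data warehouse), automation means normalizing each source into one schema once, in code, instead of copy-pasting between spreadsheets:

```python
import sqlite3

# Two sources with inconsistent formats, as often found in spreadsheets.
crm_rows = [("Acme", "2,500"), ("Globex", "1,200")]        # revenue as text
billing_rows = [{"customer": "Initech", "revenue": 3100}]  # revenue as number

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (customer TEXT, amount REAL)")

# Normalize each source into one schema before loading.
for customer, amount in crm_rows:
    conn.execute("INSERT INTO revenue VALUES (?, ?)",
                 (customer, float(amount.replace(",", ""))))
for row in billing_rows:
    conn.execute("INSERT INTO revenue VALUES (?, ?)",
                 (row["customer"], float(row["revenue"])))

total = conn.execute("SELECT SUM(amount) FROM revenue").fetchone()[0]
print(total)  # 6800.0
```

    Once the normalization lives in a pipeline rather than in someone’s spreadsheet, it runs the same way every time, which is where the error reduction comes from.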

    Automated processes can also help businesses optimize customer engagement and marketing efforts. There are automated tools for content marketing, online donations, social media management, etc. This frees up time so you can focus on effectively meeting your business goals.

    Extend BI across all your devices

    Business Intelligence needs to keep up with the demands of modern technology. It should provide the data you need across all of your devices. Many organizations still access their data only on desktop computers, yet thanks to mobile technology, you can no longer predict how or when users will need access to important data.

    They may access the data on their smartphone, open a secure dashboard on their tablet, or sit down at a desktop computer or laptop. You can never predict which device they’ll use to access this data. That’s why you should extend BI across all devices.

    If you want to raise your BI to the next level, it must be accessible everywhere and adapt to whatever device it’s used on. One way to increase your Business Intelligence efforts is to offer a mobile option for users. It will allow your employees to stay informed at all times.

    Make use of external data

    Business Intelligence allows you to create and run reports and gather insights over internal data. It can answer questions about your organization’s profit, productivity, revenue, and other important factors. But most businesses aren’t aware of the goldmine of information available in external sources. You have access to more data than you know.


    How can Business Intelligence help your business achieve its goals? If you combine external data with your internal data, you open up new possibilities for your entire organization, adding more information about your existing and potential customers. It’s imperative for brick-and-mortar businesses to use this information to increase marketing opportunities and expand their customer base.

    Author: Philip Piletic

    Source: SmartDataCollective

  • 7 Personality assets required to be successful in data and tech

    7 Personality assets required to be successful in data and tech

    If you look at many of the best-known visionaries, such as Richard Branson, Elon Musk, and Steve Jobs, there are certain traits that they all have which are necessary for being successful. So this got me thinking, what are the characteristics necessary for success in the tech industry? In this blog, I’m going to explain the seven personality traits that I decided are necessary for success, starting with:

    1. Analytical capabilities

    Technology is extremely complex. If you want to be successful, you should be able to cope with complexity. Complexity not only from technical questions, but also when it comes to applying technology in an efficient and productive way.

    2. Educational foundation

    Part of the point above is your educational foundation. I am not talking so much about specific technical expertise learned at school or university, but more the general basis for understanding certain theories and relations. The ability to learn and process new information very quickly is also important. We all know that we have to learn new things continuously.

    3. Passion

    One of the most important things in the secret sauce for success is being passionate about what you do. Passion is the key driver of human activity, and if you love what you’re doing, you’ll be able to move mountains and conquer the world. If you are not passionate about what you are doing, you are doing the wrong thing.

    4. Creativity

    People often believe that if you are just analytical and smart, you’ll automatically find a good solution. But in the world of technology, there is no one single, optimal, rational solution in most cases. Creating technology is a type of art, where you have to look for creative solutions, rather than having a genius idea. History teaches us that the best inventions are born out of creativity.

    5. Curiosity

    The best technology leaders never stop being curious like children. Preserving an open mind, challenging everything and keeping your curiosity for new stuff will facilitate your personal success in a constantly changing world.

    6. Persistence

    If you are passionate, smart and creative and find yourself digging deeply into a technological problem, then you’ll definitively need persistence. Keep being persistent to analyze your problem appropriately, to find your solution, and eventually to convince others to use it.

    7. Being a networker and team player

    If you have all the other skills, you might already be successful. But, the most important booster of your success is your personal skillset. Being a good networker and team player, and having the right people in your network to turn to for support, will make the whole journey far easier. There might be successful mavericks, but the most successful people in technology have a great set of soft skills.

    As you’ll notice, these characteristics aren’t traits that you are necessarily born with. For those who find that these characteristics don’t come naturally to them, you’ll be pleased to hear that all can be learned and adopted through hard work and practice. Anyone can be successful in tech, and by keeping these traits in mind in future, you too can ensure a long and successful career in tech.

    Author: Mathias Golombek

    Source: Dataversity


  • A brief guide for those who consider a career in market intelligence

    A brief guide for those who consider a career in market intelligence

    Market research and insights careers are having a moment thanks to the proliferation of data across the business world. Here’s how to become a part of the community.

    Thanks to the proliferation of data across so many aspects of the business world, careers in insights, analytics, and marketing research are having a moment.

    “Data and analytics, generically speaking, are driving a big piece of how businesses are spending their time and money,” said Gregg Archibald, Managing Partner at Gen2 Advisors, on a recent GreenBook podcast. “If you are in the marketing research field, data and analytics, project management, whatever, you’ve got a job for a long time to come.”

    So let’s take a look at how you can get into the field and secure a position in market research.

    What careers are in market research?

    A common position for newcomers to insights and analytics is market research analyst. Market research analysts typically curate and synthesize existing or secondary data, gather data from primary sources, and examine the results of data collection. Often they are tasked with communicating results to client stakeholders – externally or internally within their own organization. 

    At the entry level, you’ll find fieldwork and research directors on the supplier side. You might find specialists like UX and qualitative researchers working independently after they’ve paid their dues. And on the client side, key roles include managers of insights and analytics, or general corporate researchers. Market research analyst jobs might have different titles, but the basic premise is the same: collect and interpret qualitative or quantitative data.

    What’s the current outlook for insights careers?

    The U.S. Bureau of Labor Statistics suggests that the job outlook for market research analysts is growing faster than average, at a rate of 22%. Ironically, demand for survey researchers is growing at a much slower rate (4%). Why? Well, I might speculate that it’s one thing to be able to develop and implement a survey instrument. It’s totally another to be able to analyze the results and make actionable recommendations. 

    According to the latest wave of the GRIT Report, after the lows of the pandemic, staff size increases are at an all-time high. This might be surprising, knowing that we are presently experiencing economic uncertainty.

    “While many venture-capital-backed companies are shedding people in anticipation of the upcoming recession,” explains Lenny Murphy, “other non-VC backed companies are actively hiring.” So consider targeting private, private equity-backed, or public companies in your search.

    GRIT data from this report is also telling us that among supplier segments, technology and data and analytics providers have the most staff size increases. While targeting vendors is a strategy many put on the back burner in pursuit of corporate, client-side researcher roles, it represents a clear path to entry in our industry.

    How do I start a career in market research?

    The career journeys of market researchers are as vast as they are many. I was hired as a Data Analyst at a full-service research firm while still in school. Within months, I lost my job to layoffs. I quickly was re-hired at a qualitative research consultancy as an Assistant Field Director. From there, I took deliberate steps to grow my experience, moving first from supportive roles to that of a researcher, then from consulting and into management positions. Other people might share with you that their careers were more happenstance – they fell into certain things or stayed in one role for the long haul.

    There are, however, a few things I’d recommend as you look to get started in a market research career.

    1. Consider your education:

    Though there are outliers in every industry, most people break into insights with a minimum of a bachelor’s degree. Some career paths, like mine, started with a major in marketing. Other insights professionals studied communications, social science, psychology, economics, and more increasingly, statistics, data and analytics specifically.

    Some companies, and higher-level positions, will require a master’s degree. Many key players in our industry have earned their MBAs; others have achieved their Master’s in Marketing Research. Some data and analytics experts come from advanced fields such as statistics and/or behavioral economics.

    Aside from the areas of concentration your studies will allow, there are the soft skills you develop in school that serve most people well. Among the skills in demand according to the latest GRIT Report are people skills, technical/computer expertise, and innovation, problem-solving, and critical-thinking abilities. 

    2. Seek entry-level experience:

    Depending upon the position, insights jobs require expertise/experience in either qualitative or quantitative research methods. Analytical expertise is in demand, but so is basic business acumen and industry knowledge.

    Sales and/or business development skills are always in demand at research vendors. Taking one of those positions might give you the baseline knowledge of the marketplace that other candidates don’t have at an insights industry entry level. This insider knowledge of the data and analytics space you gain attending conferences and conversing with suppliers and buyers could set you apart.

    Finally, many research companies – from smaller platforms to larger insights consultancies – have growing content departments and a need for marketing expertise. 

    3. Switch from an adjacent field:

    If you peruse my LinkedIn feed, you might see qualitative researchers who started out as anthropologists or psychologists. You might learn about a UX researcher who has a PhD and started out in sensory science. You might discover a marketing intern turned research business CEO and founder.

    My point is, don’t look for the perfect start. Just start somewhere. There’s this great video online at Harvard Business Review by KeyAnna Schmiedl that talks to my favorite analogy for career development: There isn’t one particular linear path all market researchers travel. Instead, there’s a variety of routes up the equivalent of a rock climbing wall. Your journey might include a trip to the side or even back down a little as you make your way to the summit.

     Author: Karen Lynch

    Source: Greenbook Blog

  • Aligning your business with your data team

    Aligning your business with your data team

    It’s important for everyone at a company to have the data they need to make decisions. However, if they just work with their data team to retrieve specific metrics, they are missing out. Data teams can provide a lot more insights at a faster rate, but you will need to know how to work with them to make sure that everyone is set up for success. 

    Data teams can be thought of as experts at finding answers in data, but it’s important to understand how they do that. In order to get the most value out of your collaboration, you need to help them understand the questions that matter to you and your team and why those questions need to be answered. There are a lot of assumptions that get built into any analysis, so the more the data team knows about what you are looking for, the more knowledge they may find as they explore data to produce their analysis. Here are four tips to make more productive requests from members of your data team: 

    Approach data with an open mind

    It’s important to treat the data request process as an open-ended investigation, not a way to find data that proves a point. A lot of unexpected insights can be found along the way. Make your goal to ask questions and let your data team search for the answers. This approach will allow you to get the best insights, the type of unknowns that could change your decision for the better. If you put limitations on what you’re asking the data, you’ll end up putting limitations on the insights you can get out of your inquiry. 

    To really dig into this, think about how questions are answered scientifically. Scientists treat any bias as an opportunity for the insight to be compromised. For example, let’s say you are looking to improve customer satisfaction with your product. Requesting a list of customers with the highest and lowest NPS scores will give you a list of people who are happiest or most frustrated, but it is not going to let you know how to improve customer satisfaction. This request puts too much attention on the outliers in your customer base rather than identifying the key pain points. That’s part of the picture, but not all of it. If you’re trying to create a program that targets your goal, let your data team know the goal, give them a few optional starting points, and see what they come back with. They might surprise you with some sophisticated analysis that provides more insight and helps you launch a fantastic program. 

    Start with a conversation, not a checklist

    The single biggest mistake a line-of-business professional can make when requesting data is to present a data expert with a list of KPIs and tell the data team to just fill in the blanks. This approach misses so much of the value a data team can provide. Modern data teams have technology and abilities that allow them to go much further than just calculating numbers. They can guide analytical exploration through flexible, powerful tools to make sure you’re getting the most valuable insights out of your data.

    Instead of a list of metrics, think about starting your data request as a meeting. You can provide the business context needed and a list of questions that you want answered. You can even present some initial hypotheses about what those numbers may look like and why they might move in one direction or another. This is a great way to kick off the conversation with your data counterpart. From there, you can benefit from their experience with data to start generating new and more informed questions from their initial inquiries. The data team’s job is to get you information that helps you be more informed, so give them as much context as possible and let them work as a problem solver to find data-driven recommendations.

    Data should recommend actions, not just build KPI reports

    A lot of standard business KPIs measure the results of company efforts: revenue, lead conversion, user count, NPS, etc. These are important statistics to measure, but the people tracking them should be very clear that these numbers track how the company is moving, not why it is moving that way. To make these data points actionable, you need to take analysis further. Knowing that your NPS is going up or down is useless if it doesn’t inform a customer team about the next step to take. 

    A good data team will map important KPIs to other data and find connections. They’ll comb through information to find the levers that are impacting those important KPIs the most, then make recommendations about how to achieve your goals. When you get a list of levers, make sure to understand the assumptions behind the recommendations and then take the right actions. You can always go back to those KPI reports to test if the levers are having the intended effect.
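
    A deliberately simplified sketch of this kind of lever screening (the metrics and numbers below are invented, and a real data team would apply far more rigor than a raw correlation): rank candidate drivers by the strength of their relationship to the KPI.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Weekly NPS alongside two candidate levers (illustrative numbers).
nps              = [31, 34, 33, 38, 41, 40]
response_time_hr = [22, 20, 21, 14, 9, 10]  # support response time
ad_spend_k       = [5, 7, 6, 5, 7, 6]

# Rank levers by the absolute strength of their relationship to the KPI.
ranked = sorted(
    {"response_time_hr": pearson(nps, response_time_hr),
     "ad_spend_k": pearson(nps, ad_spend_k)}.items(),
    key=lambda kv: abs(kv[1]), reverse=True)
print(ranked)
```

    Correlation is only a starting point; a data team would validate a candidate lever (and the assumptions behind it) before recommending action on it.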

    Data requests are iterative, give the data person feedback

    Communication about data should not end when the data has been delivered to you. It’s important to dig into the analysis and see what you can learn. Instead of reporting that data or taking action on it right away, check with your dashboard creator to verify that you’re reading all of the data properly and that the next steps are clear. There are a lot of ways to misinterpret data; a good way to prevent mistakes is to keep communicating.

    Even if you’ve gotten the right takeaways from the data, it’s still good to consult with your dashboard creator and go over your interpretation of the information so they know how you read data. You may need a follow-up meeting to restart with the overall question you want to answer, then see what additional data needs to be collected or what modifications are needed to make the report or dashboard work best for your intended use-case.

    Author: Christine Quan

    Source: Sisense

  • An overview of the Chief Data Officer role, past and present

    An overview of the Chief Data Officer role, past and present

    The role of the Chief Data Officer (CDO) is evolving in 2020, as CDOs face quite possibly the most important and challenging year of the role’s nearly two-decade existence while working to meet their organizations’ needs and build their data capabilities. 2020 has become the year that everyone became fully aware of data and its role in our lives.

    As COVID-19 rapidly spread into a pandemic, business models were turned on their heads, supply chains dried up, and consumer behavior drastically altered. CDOs, largely the leaders of an organization’s overall data strategy, needed to deliver new ways to harness data and deliver insights quickly to help the business adjust.

    In greater numbers than ever before, organizations worldwide are harnessing the data that flows into and out of their systems to accurately forecast sales, customer retention, new products, competitor performance, employee satisfaction, regulatory compliance, fraud reduction, and more. To put it simply: Data is a commodity that must be harnessed by your business, or you will be left behind.

    Thus, the CDO’s role has drastically evolved. No longer is the CDO only responsible for maintaining regulatory compliance and potentially increasing efficiencies – the CDO could be the most important member of the leadership team other than the CEO.


    The first CDO appointment occurred in 2002 when CapitalOne appointed Cathy Doss to the position. The role was created largely as a response to the passing of the Sarbanes-Oxley Act of 2002, which was created as a response to various financial scandals. This new regulation required far more data governance and retention than ever before. Despite the newfound need for the role, its growth was relatively flat until 2008-2010.

    As recently as 2012, only 12% of large companies had a CDO. However, as many organizations realized the important role data plays in their business, those numbers began to rise sharply. In fact, in 2018, organizations with an appointed CDO rose to nearly 70%, and Gartner estimates that by 2025, 90% of large organizations will have a CDO.

    Evolution of the role

    Initially, the CDO role was created primarily in response to new federal finance laws, and thus served largely as a defensive role focused on governance and regulation. However, as hardware and software improved and data analytics expanded, progressive executives noted the potential for offensive use of corporate data. Soon these organizations were able to turn the data they were already collecting into new efficiencies, productivity, and overall growth.

    For example, various departments’ data was previously siloed, meaning that product development data wasn’t necessarily available to customer support or marketing. Under the CDO leadership, this data now exists as a thread that weaves throughout the organization, connecting design engineers all the way through to the customer. The CDO now serves as the tip of the innovation spear and not simply as a data steward.

    Challenges and opportunities

    While the role of the CDO continues to evolve and serve their organizations in new ways, there remain challenges that must be addressed:

    • Who does the CDO report to? Ideally, the CDO will be equal on the executive team, but the organizational fit varies.
    • Stakeholder buy-in to both the usage of data and the role itself also varies greatly in different organizations.
    • Battles with the CIO. CIOs trend towards attempting to save money, while CDOs typically want to invest in new technologies.
    • High turnover. The average tenure of a CDO is two years. This may, however, come from CDOs going to where the grass is greener.
    • Clarity of mission. Only 28% of recent survey respondents say that the role is successful and established.
    • Data silos. Data is still extremely siloed and scattered in most organizations, and it can be difficult to bring it all together and, more importantly, understand it.

    But just as there are questions and challenges for CDOs, there are also opportunities the office of the CDO can now offer their organizations:

    • Revenue growth
    • Innovation
    • Fraud reduction
    • Enhanced data governance
    • Lower costs
    • Data literacy

    Changes for 2020

    Harnessing the power of data in digital transformation will be imperative for most organizations going forward. AI, ML, and data analytics are no longer buzzwords only for tech and finance, and every successful organization will pivot towards viewing data as an asset. COVID-19 has challenged everyone in all walks of life. Companies that have embraced data have been able to analyze their help desk data, VPN information, and other portions of their computing environments to determine which remote work policies are working and which are not.

    In the healthcare industry, the CDO office has provided information on the availability of personal protection equipment (PPE), beds, and staff to ensure adequate treatment is available for COVID-19 patients. Additionally, grocery chains have been taxed as never before, and data models provide valuable information on supply and demand and frontline grocery workers’ health.

    The post-COVID-19 world will offer opportunities due to the lessons learned and data ingested during the pandemic, such as enhanced digitization of workflows, robust disaster recovery plans, investment opportunities, and more. If 2020 has shown us anything, it is the power of responsible data collection and sharing. CDOs that leverage this new emphasis on data and invest in the future will steer their organizations to new heights while building robust plans for future opportunities as well as crises. This is a new era, an era where data is king, and the CDO will be a critical player in determining organizations’ success or failure.

    Author: John Morrell

    Source: Datameer

  • Approaching the Transformation phase in ELT processes

    Approaching the Transformation phase in ELT processes

    Much has been written about the shift from ETL to ELT and how ELT enables superior speed and agility for modern analytics. One important move to support this speed and agility is creating a workflow that enables data transformation to be exploratory and iterative. Defining an analysis requires an iterative loop of forming and testing these hypotheses via data transformation. Reducing the latency of that interactive loop is crucial to reducing the overall time it takes to build data pipelines.

    ELT achieves flexibility by enabling access to raw data instead of predefined subsets or aggregates. The process achieves speed by leveraging the processing power of Cloud Data Warehouses and Data Lakes. A simple, dominant pattern is emerging: move all of your data to cost effective cloud storage and leverage cloud compute for transformation of data prior to analysis.

    What this takes:

    1. Extract your data from source systems
    2. Load your data into the cloud platform
    3. Transform your data in the cloud!
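
    Sketched end to end (Python’s built-in sqlite3 stands in for the cloud warehouse here, and the table and column names are invented), the three steps look something like this, with the transform running as SQL inside the warehouse itself:

```python
import sqlite3

# 1. Extract: raw records from a source system (illustrative data).
raw_orders = [("2020-03-01", "EU", 120.0), ("2020-03-01", "US", 90.0),
              ("2020-03-02", "EU", 45.0)]

# 2. Load: land the raw data, untransformed, in the warehouse.
wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE raw_orders (order_date TEXT, region TEXT, amount REAL)")
wh.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# 3. Transform: build an analysis-ready table using the warehouse's own engine.
wh.execute("""CREATE TABLE daily_region_sales AS
              SELECT order_date, region, SUM(amount) AS total
              FROM raw_orders GROUP BY order_date, region""")

result = wh.execute(
    "SELECT * FROM daily_region_sales ORDER BY order_date, region").fetchall()
print(result)
```

    Because the raw table stays in the warehouse, an analyst can revise the transform and rerun it against the same raw data, which is what makes the iterative loop described above fast.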

    There are a few different approaches to doing the transformation work as part of the ELT process.

    Code only solutions

    Individuals and teams proficient in languages such as SQL or Python can write transformations that run directly against the cloud data warehouse or data lake. Tools such as DBT and Dataform provide infrastructure around code to help teams build more maintainable pipelines. Code gives its authors ultimate flexibility to build anything the underlying system supports. Additionally, there are large communities of Python and SQL developers as well as a wealth of examples, best practices, and forums to learn from. Of course, there are many individuals in organizations that do not know how to write code or simply prefer not to but still need to efficiently transform data.

    Visual only solutions

    Individuals and teams that prefer visual transformation can leverage visual tools to build their pipelines. To gain the benefits of ELT, visual tools increasingly execute these pipelines directly in Cloud Data Warehouses and Lakes instead of in proprietary runtimes. Visual solutions appeal to a large community of data pipeline developers who need to produce data assets but don’t necessarily want to, or know how to, code. These solutions often provide more automated approaches to building pipelines, increasing efficiency for many use cases. However, visual-only approaches can at times be less flexible than coding in the underlying system: certain use cases are not performant enough or simply not possible in the tool.

    Visual + code solutions

    We increasingly believe that modern data transformation tools will support both approaches, enabling a broader community to collaborate and scaling how this work is done in organizations. Use a visual approach where it makes sense, but also let users leverage code where it makes sense. This best-of-both-worlds approach has two major benefits:

    1. Increased efficiency for individuals: While some have strong preferences for doing their work in code or in a visual tool, we find that many people just want to get the job done as efficiently and effectively as possible. Each approach has advantages and disadvantages – providing flexibility allows an individual to choose the right tool for the job.
    2. Collaboration across the organization: Organizations have some users who prefer to code and some prefer not to. Solutions providing support for both have the potential to enable collaboration across users with varied skill sets and use cases.

    Approach at Trifacta

    Moving forward, Trifacta is increasingly investing in two areas to enable collaboration across teams and individuals working in both code and user interfaces:

    1. Running code within Trifacta: Most of Trifacta’s customers primarily leverage its visual interface to build transformations, but many also use the existing functionality for running SQL queries directly within Trifacta, building pipelines with SQL and/or its internal DSL. Soon, Trifacta plans to support other languages such as Python.
    2. Generating code with Trifacta: The pipelines customers build in Trifacta are built on top of its internal transformation language. This language is then translated into code that can run across a variety of different platforms. Today, Spark, Google Dataflow, and Photon (Trifacta’s engine for in-browser computation) are supported, and common transformations are pushed down into databases and cloud data warehouses like Snowflake, BigQuery, and Redshift. To date, this code has been run on behalf of customers, but Trifacta has received many requests to make the generated code usable completely outside of Trifacta.

    Author: Sean Kandel

    Source: Trifacta

  • Building a base for successful AI with your data strategy  

    Building a base for successful AI with your data strategy

    For many years, the underlying complexities of AI, paired with a dramatic portrayal in the media as an inevitable replacement for human jobs, created a daunting narrative that made AI difficult for most people to understand, let alone to widely adopt. Now, we’re at an exciting turning point with AI. We're beginning to understand and articulate where AI is useful for elevating people's creativity and augmenting decision making. As the pandemic has accelerated digital transformation, organizations are successfully deploying and scaling AI projects across more sophisticated, critical scenarios.

    “Eighteen months back, 85% of the enterprises that we surveyed were actually experimenting,” said Nitin Mittal, a principal at Deloitte, specializing in AI and strategic growth. “They had data science groups, they had an AI center of excellence, they had investments, they were developing proof of concepts—trying to figure out the art of the possible. After just 18 months, more than 40% of those organizations are starting to adopt AI at scale.”

    So what’s changed? Organizations are not just focused on the technology elements of AI—they’re taking a strategic, human-centric approach that balances machine intelligence with human expertise. They're creating environments of trust, where people not only have the right data for building useful models, but understand the capabilities and limitations of AI and its outputs. After all, we can't forget that the purpose of these technologies is to improve people’s data-driven decision making.

    Let’s look at a few focus areas of a people-centric strategy to help you achieve trusted data and successful AI projects: your data architecture, the processes for managing governed data, and balancing the roles of people and machines.

    Lay a strong foundation with your data architecture

    “I think one of the most important things I see people do right, is to make sure that you build the data foundation from the ground up correctly,” said Ali Ghodsi, CEO of Databricks. This starts with internal alignment—both organizationally and in terms of use cases and common goals.

    Your analysts, data scientists, data engineers, and machine learning engineers will offer unique viewpoints and preferences, and should all be brought into the conversation as experts in their areas of the business. A chief data officer and center of excellence will help in establishing ownership of the data and AI strategy, so as not to throw funding at uncoordinated or siloed efforts.

    Ali adds, “There’s a lot of stuff you have to do under the hood that you’ve got to get right—that your models are stored in the correct way, that you can reproduce your machine learning models, that you’re handling privacy in the right way—so it’s really important that you have an architecture that's built for AI from the ground up.” 

    The data lakehouse is one such architecture—with “lake” from data lake and “house” from data warehouse. This modern, cloud-based data stack enables you to have all your data in one place while unlocking both backward-looking, historical analysis as well as forward-looking scenario planning and predictive analysis. This kind of technology investment enables a broader set of data users to get value from your data within one platform—from business users doing traditional reporting and real-time analysis to data scientists working with algorithms for personalization, automation, and forecasting demand.

    Invest in strong data management and governance up front—it pays off downstream

    “Preparing data, classifying data, and tagging data so that it is machine consumable—as opposed to human consumable—is another artifact of organizations that have cracked the code,” Nitin shared of Deloitte’s findings. As data is the foundation of an AI system, the quality and reliability of AI-enabled prescriptive recommendations or automations are directly correlated to the quality and reliability of the data used to train the system. Trustworthy data can also remove a potential layer of complexity when validating that algorithms and their recommendations are transparent, ethical, and accurate.

    Organizations that lack sound data management practices or that have struggled to build traction and confidence in their data and analytics deployment stand little chance of successfully embracing AI. The same principles that have made self-service analytics possible at scale—a people-first approach to your data strategy, including a governance framework that ensures trust in data—are critical in the success of AI projects.

    “Our data strategy assures convergence on what we call the ‘golden rules’ in our decision supply chain,” said Una Shortt, VP of Data Platform and Engineering at Schneider Electric. “These rules are globally shared and they're published for everybody to access through a defined data catalog.” Setting this standard ensures that every team's systems use data from company-governed authoritative sources, including harmonized data and data objects built for reuse across analytics projects. 

    With these golden rules, data is everyone's business at Schneider Electric—not just an IT process. The upfront work to prep, clean, and govern the data means there are fewer downstream issues with quality and trust. “A shared data governance framework assures our users that if they're reusing a published, certified data object in our company, their colleagues are adopting this exact same view of the data,” shared Una. “We understand that when data is reliable, well structured, reusable, and trustworthy, it becomes a key accelerator for our business.”

    Harmonize human and machine capabilities to build trust in AI

    While machines are great at analyzing huge amounts of data (and getting better at finding hidden correlations in limited data sets), we still rely on people for long-term planning, abstract and creative thinking, and discerning causality from correlation. For this reason, AI projects need the domain experience of people who know your business and who can frame the data and AI results in the right context. By design, tools should explain the provenance of their recommendations to people, so they can make informed decisions—no one should passively accept AI recommendations.

    As Deloitte’s Nitin puts it, “Humans are working with and interacting with intelligent machines, so this notion of having human-centered AI experiences is absolutely critical.” AI offers incredible promise when we can balance its strengths with the creativity of human expertise. In short, people should always be in control of their relationship with AI.

    At Tableau and Salesforce, we believe that a people-first approach to your data strategy—combining the right architecture, tools, and processes—will help everyone trust the data and the AI applications it powers. Then, you’ll have a solid foundation for everyone to better understand and embrace AI in their decision making.

    Author: Vidya Setlur

    Source: Tableau

  • Chinachem: successful use of SAP software in the Hong Kong property market

    Chinachem: successful use of SAP software in the Hong Kong property market

    According to a January story in the South China Morning Post, Hong Kong has topped the table as the world’s most expensive housing market for the 9th straight year. That sounds like good news for property developers in that area. But, according to the same story, prices of Hong Kong homes also decreased by 7.2% over the preceding four months.

    What the news really shows is that the property market can be volatile. Combined with long construction times running into multiple years and billion dollar capital investments, that makes property development an extremely challenging industry.

    Few of Hong Kong’s developers are more aware of that than the Chinachem Group. While Chinachem began its life in agricultural projects and chemicals, it has developed its presence as one of Hong Kong’s most famous property companies over the years, through prosperous times and tough times. Recently, Chinachem was able to win a big land parcel in one of Hong Kong’s upmarket suburbs after surveyors cut their valuation by 10 per cent, another sign of softening property prices.

    However, in an industry that is often very traditional in its execution, it is not just prices that are putting property businesses under increasing competitive pressure. The digital explosion is also having a huge effect. As Chinachem’s Executive Director and Chief Executive Officer, Donald Choi, points out: technology is changing how companies in every industry are organized and run. And Chinachem is no different.

    Changing times

    Hong Kong has been lucky in a way, especially in the property market, which has been a long-term growth market. But complacency can be a killer.

    Chinachem’s Head of Group IT, Joseph Cagliarini, believes that the lesson to be learned from a truly global brand like Kodak, which went bankrupt because the world changed from film to digital photography, cannot be overlooked. Instead, he calls for a relentless pursuit of technology to make sure Chinachem is not only listening to its customers, but able to respond appropriately.

    Different companies are at different stages of embracing digital transformation and technology. Anticipating what is required and strategizing change, Chinachem has turned its eyes to a new generation of milestones, and embarked on a journey to become an intelligent business.

    For the long-time property developer, that change starts with (real-time) data. Like many companies, Chinachem didn’t have a central view of its operations. So, all of its business units operated autonomously to some extent. That created a mountain of manual processes, and many separate systems containing valuable information.

    In October 2018, Chinachem selected a comprehensive suite of SAP software and cloud solutions to drive operational efficiency across finance and HR operations for its corporate group and hotels, in order to help drive long-term efficiencies and growth. SAP is also providing Chinachem with tools to help drive rapid innovation and increase the strategic value of human resources.

    Once the solutions are fully implemented, Chinachem will enjoy a variety of benefits, including real-time updates on financial performance that will optimize their finance processes. This includes everything from planning and analysis to period end close and treasury management.

    Long-term plans

    Other key features also support the group’s long-term objectives, such as enhancing financial planning and analysis, accelerating planning cycles, increasing profitability, and making finance functions more efficient. Chinachem is now able to accelerate the building and deployment of apps and extensions that engage employees in new ways. This will allow HR to be flexible and innovative without compromising the organization’s core HR processes.

    In addition, Chinachem’s hotels can personalize their end-to-end HR landscape, creating an outstanding, seamless and secure user experience. The group can also leverage data from SAP solutions to make insightful business decisions that will have lasting impact.

    Customers are still king

    Chinachem’s journey also involves adapting to changing customers who now live on multiple platforms, both online and offline.

    With the right technology and software, Chinachem will be able to monitor customer behavior and therefore respond to customers’ needs without being asked. Executive Director of Chinachem, Donald Choi, believes that advanced data analytics could be the key to this: not to replace offline experiences, but to be at all the right places at the right time.

    In an ever-changing and increasingly digital world, a comprehensive suite of SAP software and cloud solutions may not be the final answer for all of Chinachem’s needs. However, as Donald Choi says, “it is a good starting point for this journey.”

    Author: Terence Leung

    Source: SAP

  • Context is key for organizations making data-driven decisions

    Context is key for organizations making data-driven decisions

    As organizations enter a new year, leaders across industries are increasingly collecting more data to drive innovative growth strategies. Yet to move forward effectively, these organizations need greater context around their data to make accurate and streamlined decisions.

    A recent Data in Context research study found that more than 95% of organizations suffer from a data decision gap, which is the inability to bring together internal and external data for effective decision-making. This gap imposes a number of challenges on organizations, including regulatory scrutiny and compliance issues, missed customer experience opportunities, employee retention problems, and resource drainage due to increased manual data workload.

    While the influx of data is endless, organizations that fail to obtain a holistic, contextual view of complete datasets remain at risk for ineffective decision-making and financial waste. However, with the proper systems and technologies in place, companies can overcome the data decision gap to foster success in 2022.

    Siloed Systems Create Fragmented Data

    Fragmented data and disorganized internal systems have plagued companies for years, making it difficult for organizations to harness the full potential of their data due to a lack of context. Information technology has also drastically evolved, presenting companies with hundreds of different applications to choose from for storing data. However, this range of multiple siloed systems can create disparities in data.

    For example, financial services organizations might utilize different systems for each of the products they offer to customers and those systems might not be joined together on the back end. When trying to make informed decisions about a given customer, financial services professionals will need to consider all the available data on that customer to take the right course of action – but they can do so only if they are able to look at that data holistically. Without a single customer view in place, financial and other institutions might struggle to address customer needs, creating negative experiences.

    To combat this issue, organizations need their data to move across systems in real-time feeds. Lags in data processing create missed customer opportunities if employees cannot access the latest view of up-to-date information. However, the right technologies can take fragmented data and make it accessible to individuals across a company, giving multiple employees comprehensive views of timely data.

    Outdated Data Impacts Employee Workloads

    With data constantly evolving, organizations need to implement effective Data Management systems to ensure employees are equipped with the time and knowledge they need to navigate through data seamlessly. Data can become outdated at a fast rate, and manually monitoring for these changes requires sustained energy from employees, which can prevent them from utilizing their time and talents in more productive ways. This can lead to burnout and generate retention issues. 

    Tools like artificial intelligence, entity resolution, and network generation can solve this by updating datasets in real time, giving employees more time to manage their workloads, conduct investigations, and pursue efforts to create stellar customer experiences. Not only do these technologies help improve employee routines, but they are also the key to cleaning up data, catching fraud, and enabling organizations to avoid regulatory and compliance issues.
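    The entity-resolution idea mentioned above can be sketched as follows. This is a deliberately simplistic toy (invented records, naive name normalization); real entity-resolution tools use far richer matching, scoring, and machine learning techniques.

```python
# Toy entity resolution: group customer records that appear to refer to
# the same person by comparing normalized names or exact email matches.

def normalize(name):
    """Lowercase, strip periods, and collapse whitespace in a name."""
    return " ".join(name.lower().replace(".", "").split())

def resolve(records):
    """Group records whose normalized name or email matches."""
    entities = []  # each entity is a list of records judged to match
    for rec in records:
        key_name = normalize(rec["name"])
        for entity in entities:
            if any(normalize(r["name"]) == key_name
                   or r["email"] == rec["email"] for r in entity):
                entity.append(rec)
                break
        else:
            entities.append([rec])
    return entities

records = [
    {"name": "Jane  Smith", "email": "jane@example.com"},
    {"name": "jane smith.", "email": "j.smith@example.com"},
    {"name": "Bob Lee", "email": "bob@example.com"},
]
# The two "Jane Smith" variants collapse into one entity, leaving two.
print(len(resolve(records)))  # → 2
```

    Running a step like this continuously over incoming feeds is one way duplicate or stale customer records get merged without manual review.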

    Regulatory Scrutiny and Compliance Issues

    The aforementioned study found that nearly half of respondents experienced issues with regulatory scrutiny and compliance efforts as a result of the data decision gap. This comes as no surprise given that organizations are required to have appropriate controls on data, especially in industries like financial services.

    Within financial services, regulators are enforcing stricter rules for organizations to remain compliant with their Anti-Money Laundering (AML) and Know Your Customer (KYC) models. While teams may attempt to keep customer records up to date by leveraging different systems, the underlying problem is data lineage and data quality. When regulators see any inconsistencies in a company’s data, they impose costly fines or freezes in operations until the data is sorted, creating major setbacks both internally and externally. 

    Inconsistencies in data create a lack of trust, which can spark differing views around company operations. This leads to discussions over issues that could have been better managed if a more comprehensive and accessible view of data had been available from the outset. 

    Final Thoughts

    In a world where data will continue to grow exponentially over the next several years, organizations must work to overcome the data decision gap. Organizations will always face challenges as internal and external circumstances continue to evolve, but by adopting technologies and processes to ensure data is always reflective of the latest developments, they can make the best possible decisions.

    Author: Dan Onions

    Source: Dataversity

  • Data alone is not enough, storytelling matters - part 1

    Data alone is not enough, storytelling matters - part 1

    Crafting meaningful narratives from data is a critical skill for all types of decision making, in business, and in our public discourse

    As companies connect decision-makers with advanced analytics at all levels of their organizations, they need both professional and citizen data scientists who can extract value from that data and share it. These experts help develop process-driven data workflows, ensuring employees can make predictive decisions and get the greatest possible value from their analytics technologies.

    But understanding data and communicating its value to others are two different skill sets. Your team members’ ability to do the latter impacts the true value you get from your analytics investment. This can work for or against your long-term decision-making and will shape future business success.

    There is a strong connection between stories and their ability to guide people’s decisions, even in professional settings. Sharing data in a way that adds value to decision-making processes still requires a human touch. This is true even when that data comes in the form of insights from advanced analytics.

    That’s why data storytelling is such a necessary activity. Storytellers convert complex datasets into full and meaningful narratives, rich with visualizations that help guide all types of business decisions. This can happen at all levels of the organization with the right tools, skill sets, and workflows in place. This article highlights the importance of data storytelling in enterprise organizations and illustrates the value of the narrative in decision-making processes.

    What is data storytelling?

    Data storytelling is an acquired skill. Employees who have mastered it can make sense out of a body of data and analytics insights, then convey their wisdom via narratives that make sense to other team members. This wisdom helps guide decision making in an honest, accurate, and valuable way.

    Reporting that provides deep, data-driven context beyond the static data views and visualizations is a structured part of a successful analytic lifecycle. There are three structural elements of data storytelling that contribute to its success:

    • Data: Data represents the raw material of any narrative. Storytellers must connect the dots using insights from data to create a meaningful, compelling story for decision-makers.
    • Visualization: Visualization is a way to accurately share data in the context of a narrative. Charts, graphs, and other tools “can enlighten the audience to insights that they wouldn’t see without [them],” Forbes observes, where insights might otherwise remain hidden to the untrained eye.

    • Narrative: A narrative enables the audience to understand the business and emotional importance of the storyteller’s findings. A compelling narrative helps boost decision-making and instills confidence in decision-makers.

    In the best cases, storytellers can craft and automate engaging, dynamic narrative reports using the very same platform they use to prepare data models and conduct advanced analytics inquiries. Processes may be automated so that storytellers can prepare data models and conduct inquiries easily as they shape their narrative. But whether the storyteller has access to a legacy or modern business intelligence (BI) platform, it’s the storyteller and his or her capabilities that matter most.
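    The data-to-narrative step at the heart of storytelling can be sketched in a few lines. This is an illustrative toy example with invented figures and wording, not the output of any particular BI product:

```python
# Toy data-to-narrative step: turn raw monthly figures into a short,
# decision-ready sentence a storyteller might build a report around.

def narrate(metric, values):
    """Summarize the latest period-over-period change as a plain sentence."""
    prev, last = values[-2], values[-1]
    change = (last - prev) / prev * 100
    direction = "rose" if change > 0 else "fell"
    return f"{metric} {direction} {abs(change):.1f}% in the latest period."

monthly_sales = [120, 130, 143]
print(narrate("Sales", monthly_sales))  # → Sales rose 10.0% in the latest period.
```

    In practice, of course, the storyteller adds the context a sentence like this cannot carry on its own: why the metric moved, and what decision it should inform.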

    Who are your storytellers?

    "The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that’s going to be a hugely important skill in the next decades."

    Hal R. Varian, Chief Economist, Google, 2009

    The history of analytics has been shaped by technical experts, where companies prioritized data scientists who can identify and understand raw information and process insights themselves. But as business became more data-driven, the need for insights spread across the organization. Business success called for more nuanced approaches to analysis and required broader access to analytics capabilities.

    Now, organizations more often lack the storytelling skill set - the ability to bridge the gap between analytics and business value. Successful storytellers embody this 'bridge' as a result of their ability to close the gap between analytics and business decision-makers at all levels of the organization.

    Today, a person doesn’t need to be a professional data scientist to master data storytelling. 'Citizen data scientists' can master data storytelling in the context of their or their team’s decision-making roles. In fact, the best storytellers have functional roles that equip them with the right vocabulary to communicate with their peers. It’s this “last mile” skill that makes the difference between information and results.

    Fortunately, leading BI platforms provide more self-service capabilities than ever, enabling even nontechnical users to access in-depth insights appropriate to their roles and skill levels. More than ever, employees across business functions can explore analytics data and hone their abilities in communicating its value to others. The question is whether or not you can trust your organization to facilitate their development.

    This is the end of part 1 of this article. To continue reading, you can find part 2 here.

    Author: Omri Kohl 

    Source: Pyramid Analytics

  • Data alone is not enough, storytelling matters - part 2

    Data alone is not enough, storytelling matters - part 2

    This article comprises the second half of a 2 part piece. Be sure to read part 1 before reading this article.

    Three common mistakes in data storytelling

    Of course, there are both opportunities and risks when using narratives and emotions to guide decision-making. Using a narrative to communicate important data and its context means listeners are one step removed from the insights analytics provide.

    These risks became realities in the public discourse surrounding the 2020 global COVID-19 pandemic. Even as scientists recommended isolation and social distancing to 'flatten the curve' - slow the spread of infection - fears of an economic recession grew rampant. Public figures often overlooked inconvenient medical data in favor of narratives that might reactivate economic activity, putting lives at risk.

    Fortunately, some simple insights into human behavior can help prevent large-scale mistakes. Here are three common ways storytellers make mistakes when they employ a narrative, along with a simple use case to illustrate each example:

    • 'Objective' thinking: In this case, the storyteller focuses on an organizational objective instead of the real story behind the data. This might also be called cognitive bias. It’s characterized by the storyteller approaching data with an existing assumption rather than a question. The analyst therefore runs the risk of picking data that appears to validate that assumption and overlooking data that does not.

      Imagine a retailer who wants to beat its competitor’s customer service record. Business leaders task their customer experience professionals with proving this is the case. Resolute on meeting expectations, those analysts might omit certain data that doesn’t tip the results in favor of the desired outcome.

    • 'Presentative' thinking: In this case, the storyteller focuses on the means by which he or she presents the findings - such as a data visualization method - at risk of misleading, omitting, or watering down the data. The storyteller may favor a visualization that is appealing to his or her audience at the expense of communicating real value and insights.

      Consider an example from manufacturing. Imagine a storyteller preparing a narrative about productivity for an audience that prefers quantitative data visualization. That storyteller might show, accurately, that production and sales have increased but omit qualitative data analysis featuring important customer feedback.

    • 'Narrative' thinking: In this case, the storyteller creates a narrative for the narrative’s sake, even when it does not align well with the data. This often occurs when internal attitudes have codified a specific narrative about, say, customer satisfaction or performance.

      During the early days of testing for COVID-19, the ratio of critical cases to mild ones appeared high because not everyone infected had been tested. Despite the lack of data, this quickly solidified a specific media narrative about the lethality of the disease.

    Business leaders must therefore focus on maximizing their 'insight-to-value conversion rate', as Forbes describes it, where data storytelling is both compelling enough to generate action and valuable enough for that action to yield positive business results. Much of this depends on business leaders providing storytellers with the right tools, but it also requires encouragement that sharing genuine and actionable insights is their top priority.

    Ensuring storytelling success

    “Numbers have an important story to tell. They rely on you to give them a clear and convincing voice.”

    Stephen Few, Founder & Principal, Perceptual Edge®

    So how can your practical data scientists succeed in their mission: driving positive decision-making with narratives that accurately reflect the story behind the data your analytics provide? Here are some key tips to relay to your experts:

    • Involve stakeholders in the narrative’s creation. Storytellers must not operate in a vacuum. Ensure stakeholders understand and value the narrative before its official delivery.

    • Ensure the narrative ties directly to analytics data. Remember, listeners are a step removed from the insights your storytellers access. Ensure all their observations and visualizations have their foundations in the data.

    • Provide deep context with dynamic visualizations and content. Visualizations are building blocks for your narrative. With a firm foundation in your data, each visualization should contribute honestly and purposefully to the narrative itself.

    • Deliver contextualized insights. 'Know your audience' is a key tenet in professional writing, and it’s equally valuable here. Ensure your storytellers understand how listeners will interpret certain insights and findings and be ready to clarify for those who might not understand.

    • Guide team members to better decisions. Ensure your storytellers understand their core objective - to contribute honestly and purposefully to better decision-making among their audience members.

    As citizen data science becomes more common, storytellers and their audience of decision-makers are often already on the same team. That’s why self-service capabilities, contextual dashboards, and access to optimized insights have never been so critical to empowering all levels of the organization.

    Getting started: creating a culture of successful storytelling

    Insights are only valuable when shared - and they’re only as good as your team’s ability to drive decisions with them in a positive way. It’s data storytellers who bridge the gap from pure analytics insights to the cognitive and emotional capacities that regularly guide decision-making among stakeholders. As you might have gleaned from our two COVID-19 scenarios, outcomes are better when real data, accurate storytelling, and our collective capacities are aligned.

    But storytellers still need access to the right tools and contextual elements to bridge that gap successfully. Increasing business users’ access to powerful analytics tools is your first step towards data storytelling success. That means providing your teams with an analytics platform that adds meaning and value to business decisions, no matter their level in your organization.

    If you haven't read part 1 of this article yet, you can find it here.

    Author: Omri Kohl

    Source: Pyramid Analytics

  • Data as a universal language

    Data as a universal language

    You don’t have to look very far to recognize the importance of data analytics in our world; from the weather channel using historical weather patterns to predict the summer, to a professional baseball team using on-base plus slugging percentage to determine who is more deserving of playing time, to Disney using films’ historical box office data to nail down the release date of its next Star Wars film.

    Data shapes our daily interactions with everything, from the restaurants we eat at, to the media we watch and the things that we buy. Data defines how businesses engage with their customers, using website visits, store visits, mobile check-ins and more to create a customer profile that allows them to tailor their future interactions with you. Data enhances how we watch sports, such as the world cup where broadcasters share data about players’ top speed and how many miles they run during the match. Data is also captured to remind us how much time we are wasting on our mobile devices, playing online games or mindlessly scrolling through Instagram.

    The demand for data and the ability to analyze it has also created an entire new course of study at universities around the world, as well as a career path that is currently among the fastest growing and most sought-after skillsets. While data scientists are fairly common and chief data officer is one of the newest executive positions focused on data-related responsibilities, data analytics no longer has to be exclusive to specialty roles or the overburdened IT department.

    And really, what professional can’t benefit from actionable intelligence?

    Businesses with operations across the country or around the world benefit from the ability to access and analyze a common language that drives better decision making. An increasing number of these businesses recognize that they are creating volumes of data that have value and, perhaps even more important, that they need a centralized collection system for that information so they can use the data to be more efficient and improve their chances of success.

    Sales teams, regardless of their location, can use centrally aggregated customer data to track purchasing behavior, develop pricing strategies to increase loyalty, and identify what products are purchased most frequently in order to offer complementary solutions to displace competitors.

    Marketing teams can use the same sales data to develop focused campaigns that are based on real experiences with their customers, while monitoring their effectiveness in order to make needed adjustments and/or improve future engagement.

    Inventory and purchasing can use the sales data to improve purchasing decisions, ensure inventory is at appropriate levels and better manage slow moving and dead stock to reduce the financial impact on the bottom line.

    Branch managers can use the same data to focus on their own piece of the business, growing loyalty among their core customers and tracking their salespeople’s performance.

    Accounts receivable teams can use the data to focus their efforts on the customers that need the most attention in terms of collecting outstanding invoices. And integrating the financial data with operational data paints a more complete picture of performance for financial teams and executives responsible for reporting and keeping track of the bottom line.

    Data ties all of the disciplines and departments together regardless of their locations. While some may care more about product SKUs than P&L statements or on-time-in-full deliveries, they can all benefit from a single source of truth that turns raw data into visual, easy-to-read charts, graphs and tables.

    The pace, competition and globalization of business make it critical for your company to use data to your advantage, which means moving away from gut feel or legacy habits and basing key decisions on the facts found in your ERP, CRM, HR, marketing and accounting systems. With the right translator, or data analytics software, everyone in your organization can use your data, according to their roles and responsibilities, to improve sales and marketing strategies, customer relationships, stock and inventory management, financial planning and corporate performance, making data a true universal language.

    Source: Phocas Software

  • Data interpretation: what is it and how to get value out of it? Part 1

    Data interpretation: what is it and how to get value out of it? Part 1

    Data analysis and interpretation have now taken center stage with the advent of the digital age… and the sheer amount of data can be frightening. In fact, a Digital Universe study found that the total data supply in 2012 was 2.8 trillion gigabytes! Based on that amount of data alone, it is clear the calling card of any successful enterprise in today’s global world will be the ability to analyze complex data, produce actionable insights and adapt to new market needs… all at the speed of thought.

    Business dashboards are the digital age tools for big data. Capable of displaying key performance indicators (KPIs) for both quantitative and qualitative data analyses, they are ideal for making the fast-paced and data-driven market decisions that push today’s industry leaders to sustainable success. Through the art of streamlined visual communication, data dashboards permit businesses to engage in real-time and informed decision-making and are key instruments in data interpretation. First of all, let’s find a definition to understand what lies behind data interpretation meaning.

    What Is Data Interpretation?

    Data interpretation refers to the process of using diverse analytical methods to review data and arrive at relevant conclusions. The interpretation of data helps researchers to categorize, manipulate, and summarize the information in order to answer critical questions.

    The importance of data interpretation is evident, and this is why it needs to be done properly. Data is very likely to arrive from multiple sources and has a tendency to enter the analysis process with haphazard ordering. Data interpretation also tends to be extremely subjective. That is to say, the nature and goal of interpretation will vary from business to business, likely correlating to the type of data being analyzed. While there are several different types of processes that are implemented based on the nature of the individual data, the two broadest and most common categories are “quantitative analysis” and “qualitative analysis”.

    Yet, before any serious data interpretation inquiry can begin, the scale of measurement must be decided for the data, as this choice will have a long-term impact on data interpretation ROI; visual presentations of data findings are irrelevant unless that decision has been made soundly. The varying scales include:

    • Nominal Scale: non-numeric categories that cannot be ranked or compared quantitatively. Variables are exclusive and exhaustive.
    • Ordinal Scale: exclusive and exhaustive categories with a logical order. Quality ratings and agreement ratings are examples of ordinal scales (i.e., good, very good, fair, etc., OR agree, strongly agree, disagree, etc.).
    • Interval: a measurement scale where data is grouped into categories with orderly and equal distances between the categories. The zero point is arbitrary, as with temperature in degrees Celsius.
    • Ratio: contains the features of all three scales plus a true zero point, so ratios between values are meaningful (e.g., height or weight).
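
    The scale chosen also determines which summary statistics are meaningful. As a rough illustration (a hypothetical Python sketch; the data values are invented), the mode suits nominal data, the median suits ordinal data, and the mean requires at least an interval scale:

```python
# Which summary statistic fits which scale of measurement?
# Illustrative data only.
from statistics import mean, median
from collections import Counter

nominal = ["red", "blue", "red", "green"]   # categories only -> mode
ordinal = [1, 2, 2, 3, 3, 3]                # ordered ratings -> median
ratio   = [12.0, 15.5, 9.8, 20.1]           # true zero point -> mean, ratios

print(Counter(nominal).most_common(1)[0][0])  # mode: "red"
print(median(ordinal))                        # 2.5
print(mean(ratio))                            # 14.35
```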

    Once scales of measurement have been selected, it is time to select which of the two broad interpretation processes will best suit your data needs. Let’s take a closer look at those specific data interpretation methods and possible data interpretation problems.

    How To Interpret Data?

    When interpreting data, an analyst must try to discern the differences between correlation, causation, and coincidence, as well as many other biases, and must also consider all the factors involved that may have led to a result. There are various data interpretation methods one can use.

    The interpretation of data is designed to help people make sense of numerical data that has been collected, analyzed, and presented. Having a baseline method (or methods) for interpreting data will provide your analyst teams with a structure and a consistent foundation. Indeed, if several departments have different approaches to interpreting the same data while sharing the same goals, mismatched objectives can result. Disparate methods will lead to duplicated efforts, inconsistent solutions, wasted energy, and inevitably – time and money. In this part, we will look at the two main methods of interpretation of data: qualitative and quantitative analysis.

    Qualitative Data Interpretation

    Qualitative data analysis can be summed up in one word – categorical. With qualitative analysis, data is not described through numerical values or patterns, but through the use of descriptive context (i.e., text). Typically, narrative data is gathered by employing a wide variety of person-to-person techniques. These techniques include:

    • Observations: detailing behavioral patterns that occur within an observation group. These patterns could be the amount of time spent in an activity, the type of activity, and the method of communication employed.
    • Focus groups: grouping people and asking them relevant questions to generate a collaborative discussion about a research topic.
    • Secondary Research: much like how patterns of behavior can be observed, different types of documentation resources can be coded and divided based on the type of material they contain.
    • Interviews: one of the best collection methods for narrative data. Inquiry responses can be grouped by theme, topic, or category. The interview approach allows for highly-focused data segmentation.

    A key difference between qualitative and quantitative analysis is clearly noticeable in the interpretation stage. Qualitative data, as it is widely open to interpretation, must be “coded” so as to facilitate the grouping and labeling of data into identifiable themes. As person-to-person data collection techniques can often result in disputes pertaining to proper analysis, qualitative data analysis is often summarized through three basic principles: notice things, collect things, think about things.
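
    To make the idea of “coding” concrete, here is a deliberately simplified Python sketch (the responses, keywords, and theme names are all invented for illustration) that tags free-text answers with themes so they can be grouped and counted:

```python
# Toy qualitative "coding": label free-text responses with themes
# via keyword matching, then count responses per theme.
from collections import defaultdict

responses = [
    "The staff was friendly and helpful",
    "Delivery took far too long",
    "Friendly support, quick answers",
    "Shipping was slow and the box was damaged",
]

codes = {"friendly": "service", "support": "service",
         "delivery": "logistics", "shipping": "logistics", "slow": "logistics"}

themes = defaultdict(list)
for text in responses:
    for keyword, theme in codes.items():
        if keyword in text.lower():
            themes[theme].append(text)
            break   # assign each response to its first matching theme

print({theme: len(items) for theme, items in themes.items()})
```

    In real research the coding scheme is usually built iteratively by a human reader rather than by fixed keywords; the sketch only shows why coded data becomes countable and comparable.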

    Quantitative Data Interpretation

    If quantitative data interpretation could be summed up in one word (and it really can't), that word would be “numerical.” There are few certainties when it comes to data analysis, but you can be sure that if the research you are engaging in has no numbers involved, it is not quantitative research. Quantitative analysis refers to a set of processes by which numerical data is analyzed. More often than not, it involves the use of statistical measures such as the standard deviation, mean, and median. Let's quickly review the most common statistical terms:

    • Mean: a mean represents a numerical average for a set of responses. When dealing with a data set (or multiple data sets), a mean will represent a central value of a specific set of numbers. It is the sum of the values divided by the number of values within the data set. Other terms that can be used to describe the concept are arithmetic mean, average and mathematical expectation.
    • Standard deviation: this is another statistical term commonly appearing in quantitative analysis. Standard deviation reveals the distribution of the responses around the mean. It describes the degree of consistency within the responses; together with the mean, it provides insight into data sets.
    • Frequency distribution: this is a measurement gauging the rate at which a response appears within a data set. When using a survey, for example, frequency distribution can determine the number of times a specific ordinal scale response appears (i.e., agree, strongly agree, disagree, etc.). Frequency distribution is particularly useful for determining the degree of consensus among data points.
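
    The three terms above can be computed directly with Python's standard library; the satisfaction ratings below are made up for illustration:

```python
# Mean, standard deviation, and frequency distribution of survey responses.
from statistics import mean, stdev
from collections import Counter

scores = [4, 5, 3, 4, 5, 4, 2, 4]   # hypothetical 1-5 satisfaction ratings

print(mean(scores))     # 3.875  (central value of the responses)
print(stdev(scores))    # ~0.991 (spread of responses around the mean)
print(Counter(scores))  # frequency of each distinct response
```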

    Typically, quantitative data is measured by visually presenting correlation tests between two or more variables of significance. Different processes can be used together or separately, and comparisons can be made to ultimately arrive at a conclusion. Other signature interpretation processes of quantitative data include:

    • Regression analysis: Essentially, regression analysis uses historical data to understand the relationship between a dependent variable and one or more independent variables. Knowing which variables are related and how they developed in the past allows you to anticipate possible outcomes and make better decisions going forward. For example, if you want to predict your sales for next month you can use regression analysis to understand what factors will affect them such as products on sale, the launch of a new campaign, among many others. 
    • Cohort analysis: This method identifies groups of users who share common characteristics during a particular time period. In a business scenario, cohort analysis is commonly used to understand different customer behaviors. For example, a cohort could be all users who have signed up for a free trial on a given day. An analysis would be carried out to see how these users behave, what actions they carry out, and how their behavior differs from other user groups.
    • Predictive analysis: As its name suggests, the predictive analysis method aims to predict future developments by analyzing historical and current data. Powered by technologies such as artificial intelligence and machine learning, predictive analytics practices enable businesses to spot trends or potential issues and plan informed strategies in advance.
    • Prescriptive analysis: Also powered by predictions, the prescriptive analysis method uses techniques such as graph analysis, complex event processing, neural networks, among others, to try to unravel the effect that future decisions will have in order to adjust them before they are actually made. This helps businesses to develop responsive, practical business strategies.
    • Conjoint analysis: Typically applied to survey analysis, the conjoint approach is used to analyze how individuals value different attributes of a product or service. This helps researchers and businesses to define pricing, product features, packaging, and many other attributes. A common use is menu-based conjoint analysis, in which individuals are given a “menu” of options from which they can build their ideal concept or product. This way, analysts can understand which attributes respondents pick above others and draw conclusions.
    • Cluster analysis: Last but not least, cluster analysis is a method used to group objects into categories. Since there is no target variable when using cluster analysis, it is a useful method for finding hidden trends and patterns in the data. In a business context, clustering is used for audience segmentation to create targeted experiences, and in market research it is often used to identify age groups, geographical information, earnings, among others.
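
    As a small taste of the first method in the list, here is a hedged Python sketch of a one-variable least-squares regression (the monthly sales figures are invented) used to anticipate next month's sales:

```python
# Simple linear regression by ordinary least squares, then a one-step forecast.
from statistics import mean

months = [1, 2, 3, 4, 5, 6]                  # independent variable
sales  = [100, 110, 125, 130, 145, 150]      # dependent variable

x_bar, y_bar = mean(months), mean(sales)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(months, sales))
         / sum((x - x_bar) ** 2 for x in months))
intercept = y_bar - slope * x_bar

forecast = intercept + slope * 7             # predicted sales for month 7
print(round(slope, 2), round(forecast, 1))   # 10.29 162.7
```

    Real regression work would also check the fit (e.g., residuals, R²) and consider additional independent variables such as campaigns or promotions, as the paragraph above notes.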

    Now that we have seen how to interpret data, let's move on and ask ourselves some questions: what are some data interpretation benefits? Why do all industries engage in data research and analysis? These are basic questions, but they often don’t receive adequate attention.

    Why Data Interpretation Is Important

    The purpose of collection and interpretation is to acquire useful and usable information and to make the most informed decisions possible. From businesses to newlyweds researching their first home, data collection and interpretation provides limitless benefits for a wide range of institutions and individuals.

    Data analysis and interpretation, regardless of the method and qualitative/quantitative status, may include the following characteristics:

    • Data identification and explanation
    • Comparing and contrasting of data
    • Identification of data outliers
    • Future predictions

    Data analysis and interpretation, in the end, help improve processes and identify problems. It is difficult to grow and make dependable improvements without, at the very least, minimal data collection and interpretation. What is the keyword? Dependable. Vague ideas regarding performance enhancement exist within all institutions and industries. Yet, without proper research and analysis, an idea is likely to remain in a stagnant state forever (i.e., minimal growth). So… what are a few of the business benefits of digital age data analysis and interpretation? Let’s take a look!

    1) Informed decision-making: A decision is only as good as the knowledge that formed it. Informed data decision-making has the potential to set industry leaders apart from the rest of the market pack. Studies have shown that companies in the top third of their industries are, on average, 5% more productive and 6% more profitable when implementing informed data decision-making processes. Most decisive actions will arise only after a problem has been identified or a goal defined. Data analysis should include identification, thesis development, and data collection followed by data communication.

    If institutions only follow that simple order, one that we should all be familiar with from grade school science fairs, then they will be able to solve issues as they emerge in real-time. Informed decision-making has a tendency to be cyclical. This means there is really no end, and eventually new questions and conditions arise within the process that need to be studied further. The monitoring of data results will inevitably return the process to the start with new data and insights.

    2) Anticipating needs with trends identification: data insights provide knowledge, and knowledge is power. The insights obtained from market and consumer data analyses have the ability to set trends for peers within similar market segments. A perfect example of how data analysis can impact trend prediction can be evidenced in the music identification application, Shazam. The application allows users to upload an audio clip of a song they like, but can’t seem to identify. Users make 15 million song identifications a day. With this data, Shazam has been instrumental in predicting future popular artists.

    When industry trends are identified, they can then serve a greater industry purpose. For example, the insights from Shazam's monitoring benefit not only Shazam in understanding how to meet consumer needs, but also grant music executives and record label companies an insight into the pop-culture scene of the day. Data gathering and interpretation processes can allow for industry-wide climate prediction and result in greater revenue streams across the market. For this reason, all institutions should follow the basic data cycle of collection, interpretation, decision making, and monitoring.

    3) Cost efficiency: Proper implementation of data analysis processes can provide businesses with profound cost advantages within their industries. A recent data study performed by Deloitte vividly demonstrates this in finding that data analysis ROI is driven by efficient cost reductions. Often, this benefit is overlooked because making money is typically viewed as “sexier” than saving money. Yet, sound data analyses have the ability to alert management to cost-reduction opportunities without any significant exertion of effort on the part of human capital.

    A great example of the potential for cost efficiency through data analysis is Intel. Prior to 2012, Intel would conduct over 19,000 manufacturing function tests on their chips before they could be deemed acceptable for release. To cut costs and reduce test time, Intel implemented predictive data analyses. By using historic and current data, Intel now avoids testing each chip 19,000 times by focusing on specific and individual chip tests. After its implementation in 2012, Intel saved over $3 million in manufacturing costs. Cost reduction may not be as “sexy” as data profit, but as Intel proves, it is a benefit of data analysis that should not be neglected.

    4) Clear foresight: companies that collect and analyze their data gain better knowledge about themselves, their processes, and performance. They can identify performance challenges when they arise and take action to overcome them. Data interpretation through visual representations lets them process their findings faster and make better-informed decisions on the future of the company.

    This concludes part 1 of the article. Interested in the remainder of the article? Read part 2 here!

    Author: Bernardita Calzon

    Source: Datapine

  • Data interpretation: what is it and how to get value out of it? Part 2

    Data interpretation: what is it and how to get value out of it? Part 2

    If you haven't read part 1 of this article yet, you can find it here!

    Common Data Analysis And Interpretation Problems

    The oft-repeated mantra of those who fear data advancements in the digital age is “big data equals big trouble.” While that statement is not accurate, it is safe to say that certain data interpretation problems or “pitfalls” exist and can occur when analyzing data, especially at the speed of thought. Let’s identify some of the most common data misinterpretation risks and shed some light on how they can be avoided:

    1) Correlation mistaken for causation: our first misinterpretation of data refers to the tendency of data analysts to mix the cause of a phenomenon with correlation. It is the assumption that because two actions occurred together, one caused the other. This is not accurate as actions can occur together absent a cause and effect relationship.

    • Digital age example: assuming that increased revenue is the result of increased social media followers… there might be a definitive correlation between the two, especially with today's multi-channel purchasing experiences. But that does not mean an increase in followers is the direct cause of increased revenue. There could be a common cause or an indirect causality.
    • Remedy: attempt to eliminate the variable you believe to be causing the phenomenon.
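
    The point is easy to demonstrate in code: a Pearson correlation coefficient, computed below in a Python sketch with fabricated follower and revenue figures, can be close to 1 while saying nothing at all about cause and effect:

```python
# Pearson correlation between two fabricated series: a high r shows they
# move together, not that one causes the other.
from math import sqrt

followers = [1000, 1200, 1500, 1700, 2000]
revenue   = [10.0, 11.5, 13.0, 14.5, 16.0]

n = len(followers)
mx = sum(followers) / n
my = sum(revenue) / n
cov = sum((x - mx) * (y - my) for x, y in zip(followers, revenue))
r = cov / sqrt(sum((x - mx) ** 2 for x in followers)
               * sum((y - my) ** 2 for y in revenue))
print(round(r, 3))   # very close to 1.0 -- still not evidence of causation
```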

    2) Confirmation bias: our second data interpretation problem occurs when you have a theory or hypothesis in mind but are intent on only discovering data patterns that provide support to it while rejecting those that do not.

    • Digital age example: your boss asks you to analyze the success of a recent multi-platform social media marketing campaign. While analyzing the potential data variables from the campaign (one that you ran and believe performed well), you see that the share rate for Facebook posts was great, while the share rate for Twitter Tweets was not. Using only the Facebook posts to prove your hypothesis that the campaign was successful would be a perfect manifestation of confirmation bias.
    • Remedy: as this pitfall is often based on subjective desires, one remedy would be to analyze data with a team of objective individuals. If this is not possible, another solution is to resist the urge to make a conclusion before data exploration has been completed. Remember to always try to disprove a hypothesis, not prove it.

    3) Irrelevant data: the third data misinterpretation pitfall is especially important in the digital age. As large data is no longer centrally stored, and as it continues to be analyzed at the speed of thought, it is inevitable that analysts will focus on data that is irrelevant to the problem they are trying to correct.

    • Digital age example: in attempting to gauge the success of an email lead generation campaign, you notice that the number of homepage views directly resulting from the campaign increased, but the number of monthly newsletter subscribers did not. Based on the number of homepage views, you decide the campaign was a success when really it generated zero leads.
    • Remedy: proactively and clearly frame any data analysis variables and KPIs prior to engaging in a data review. If the metric you are using to measure the success of a lead generation campaign is newsletter subscribers, there is no need to review the number of homepage visits. Be sure to focus on the data variable that answers your question or solves your problem and not on irrelevant data.

    4) Truncating an axis: When creating a graph to start interpreting the results of your analysis, it is important to keep the axes truthful and avoid generating misleading visualizations. Starting an axis at a value that doesn't portray the actual truth about the data can lead to false conclusions.

    • Digital age example: In the image below we can see a graph from Fox News in which the Y-axis starts at 34%, making the difference between 35% and 39.6% seem far larger than it actually is. This could lead to a misinterpretation of the tax rate changes.
    • Remedy: Be careful with the way your data is visualized. Be respectful and realistic with axes to avoid misinterpretation of your data. 

    5) (Small) sample size: Another common data analysis and interpretation problem is the use of a small sample size. Logically, the bigger the sample size, the more accurate and reliable the results. However, this also depends on the size of the effect being studied. For example, the sample size in a survey about the quality of education will not be the same as in one about people doing outdoor sports in a specific area.

    • Digital age example: Imagine you ask 30 people a question and 29 answer “yes”, roughly 97% of the total. Now imagine you ask the same question to 1000 people and 970 of them answer “yes”, which is again 97%. While these percentages might look the same, they certainly do not mean the same thing, as a sample of 30 people is not large enough to establish a trustworthy conclusion.
    • Remedy: Researchers say that in order to determine the correct sample size for truthful and meaningful results, it is necessary to define a margin of error that represents the maximum amount they want the results to deviate from the statistical mean. Paired with this, they need to define a confidence level, usually between 90% and 99%. With these two values in hand, researchers can calculate an accurate sample size for their studies.
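
    The calculation described in the remedy can be sketched in a few lines of Python, under the usual normal-approximation assumptions (worst-case proportion p = 0.5):

```python
# Required sample size for a given margin of error and confidence level:
# n = z^2 * p * (1 - p) / e^2, rounded up.
from math import ceil
from statistics import NormalDist

def sample_size(margin_of_error, confidence, p=0.5):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

# 95% confidence with a 5% margin of error gives the classic n = 385.
print(sample_size(0.05, 0.95))
```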

    6) Reliability, subjectivity, and generalizability: When performing qualitative analysis, researchers must consider practical and theoretical limitations when interpreting the data. In some cases, qualitative research can be considered unreliable because of uncontrolled factors that might or might not affect the results. This is paired with the fact that the researcher has a primary role in the interpretation process, meaning they decide what is relevant and what is not, and, as we know, interpretations can be very subjective.

    Generalizability is also an issue that researchers face when dealing with qualitative analysis. As mentioned in the point about small sample size, it is difficult to draw conclusions that are 100% representative because the results might be biased or unrepresentative of a wider population. 

    While these factors are mostly present in qualitative research, they can also affect quantitative analysis. For example, when choosing which KPIs to portray and how to portray them, analysts can also be biased and represent them in a way that benefits their analysis.

    • Digital age example: Biased questions in a survey are a great example of reliability and subjectivity issues. Imagine you are sending a survey to your clients to see how satisfied they are with your customer service, with this question: “how amazing was your experience with our customer service team?”. Here we can see that this question is clearly influencing the response of the individual by putting the word “amazing” in it.
    • Remedy: A solution to avoid these issues is to keep your research honest and neutral. Keep the wording of the questions as objective as possible. For example: “on a scale of 1-10, how satisfied were you with our customer service team?”. This is not leading the respondent to any specific answer, meaning the results of your survey will be reliable.

    Data Interpretation Techniques and Methods

    Data analysis and interpretation are critical to developing sound conclusions and making better-informed decisions. As we have seen with this article, there is an art and science to the interpretation of data. To help you with this purpose here we will list a few relevant data interpretation techniques, methods, and tricks you can implement for a successful data management process. 

    As mentioned at the beginning of this post, the first step to interpreting data successfully is to identify the type of analysis you will perform and apply the methods accordingly. Clearly differentiate between qualitative analysis (observe, document, and interview; notice, collect, and think about things) and quantitative analysis (research based on a large amount of numerical data to be analyzed through various statistical methods).

    1) Ask the right data interpretation questions

    The first data interpretation technique is to define a clear baseline for your work. This can be done by answering some critical questions that will serve as a useful guideline to start. Some of them include: what are the goals and objectives of my analysis? What type of data interpretation method will I use? Who will use this data in the future? And most importantly, what general question am I trying to answer?

    Once all this information has been defined, you will be ready to collect your data. As mentioned at the beginning of the post, your methods for data collection will vary depending on what type of analysis you use (qualitative or quantitative). With all the needed information in hand, you are ready to start the interpretation process, but first, you need to visualize your data. 

    2) Use the right data visualization type 

    Data visualizations such as business graphs, charts, and tables are fundamental to successfully interpreting data. This is because the visualization of data via interactive charts and graphs makes the information more understandable and accessible. As you might be aware, there are different types of visualizations you can use, but not all of them are suitable for every analysis purpose. Using the wrong graph can lead to misinterpretation of your data, so it is very important to carefully pick the right visual for it. Let's look at some use cases of common data visualizations.

    • Bar chart: One of the most used chart types, the bar chart uses rectangular bars to show the relationship between two or more variables. There are different types of bar charts for different interpretations, including the horizontal bar chart, column chart, and stacked bar chart.
    • Line chart: Most commonly used to show trends, accelerations or decelerations, and volatility, the line chart aims to show how data changes over a period of time, for example, sales over a year. A few tips to keep this chart ready for interpretation: do not use so many variables that they overcrowd the graph, and keep your axis scale close to the highest data point to avoid making the information hard to read.
    • Pie chart: Although it doesn't do a lot in terms of analysis due to its simple nature, the pie chart is widely used to show the proportional composition of a variable. Visually speaking, showing a percentage in a bar chart is far more complicated than showing it in a pie chart. However, this also depends on the number of variables you are comparing: if your pie chart would need to be divided into ten portions, it is better to use a bar chart instead.
    • Tables: While they are not a specific type of chart, tables are widely used when interpreting data. Tables are especially useful when you want to portray data in its raw format. They give you the freedom to easily look up or compare individual values while also displaying grand totals.

    With the use of data visualizations becoming more and more critical for businesses' analytical success, many tools have emerged to help users visualize their data in a cohesive and interactive way. One of the most popular is the BI dashboard. These visual tools provide a centralized view of various graphs and charts that paint a bigger picture about a topic. We will discuss the power of dashboards for efficient data interpretation further in the next portion of this post. If you want to learn more about different types of data visualizations, take a look at our complete guide on the topic.

    3) Keep your interpretation objective

    As mentioned above, keeping your interpretation objective is a fundamental part of the process. Being the person closest to the investigation, it is easy to become subjective when looking for answers in the data. A good way to stay objective is to show the information to other people related to the study, for example, research partners or even the people who will use your findings once they are done. This can help avoid confirmation bias and any reliability issues with your interpretation.

    4) Mark your findings and draw conclusions

    Findings are the observations you extracted from your data. They are the facts that will help you draw deeper conclusions about your research. For example, findings can be trends and patterns that you found during your interpretation process. To put your findings into perspective, you can compare them with other resources that used similar methods and use them as benchmarks.

    Reflect on your own thinking and reasoning, and be aware of the many pitfalls that data analysis and interpretation carry: correlation versus causation, subjective bias, false information, inaccurate data, etc. Once you are comfortable with your interpretation of the data, you will be ready to develop conclusions, see if your initial questions were answered, and suggest recommendations based on them.

    Interpretation of Data: The Use of Dashboards Bridging The Gap

    As we have seen, quantitative and qualitative methods are distinct types of data analyses. Both offer a varying degree of return on investment (ROI) regarding data investigation, testing, and decision-making. Because of their differences, it is important to understand how dashboards can be implemented to bridge the quantitative and qualitative information gap. How are digital data dashboard solutions playing a key role in merging the data disconnect? Here are a few of the ways:

    1) Connecting and blending data. With today’s pace of innovation, it is no longer feasible (nor desirable) to have bulk data centrally located. As businesses continue to globalize and borders continue to dissolve, it will become increasingly important for businesses to possess the capability to run diverse data analyses absent the limitations of location. Data dashboards decentralize data without compromising on the necessary speed of thought while blending both quantitative and qualitative data. Whether you want to measure customer trends or organizational performance, you now have the capability to do both without the need for a singular selection.

    2) Mobile Data. Related to the notion of “connected and blended data” is that of mobile data. In today’s digital world, employees are spending less time at their desks and simultaneously increasing production. This is made possible by the fact that mobile solutions for analytical tools are no longer standalone. Today, mobile analysis applications seamlessly integrate with everyday business tools. In turn, both quantitative and qualitative data are now available on-demand where they’re needed, when they’re needed, and how they’re needed via interactive online dashboards.

    3) Visualization. Data dashboards are merging the data gap between qualitative and quantitative methods of interpretation of data, through the science of visualization. Dashboard solutions come “out of the box” well-equipped to create easy-to-understand data demonstrations. Modern online data visualization tools provide a variety of color and filter patterns, encourage user interaction, and are engineered to help enhance future trend predictability. All of these visual characteristics make for an easy transition among data methods – you only need to find the right types of data visualization to tell your data story the best way possible.
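    To make point 1 concrete, here is a small Python sketch (the customer IDs and values are invented for illustration) of the kind of merge a dashboard performs behind the scenes: blending a quantitative metric with coded qualitative survey labels into a single view:

```python
# Quantitative metric: revenue per customer (illustrative values).
revenue = {"c1": 12000, "c2": 4500, "c3": 8800, "c4": 2100}

# Qualitative data: sentiment labels coded from open-ended survey answers.
sentiment = {"c1": "promoter", "c2": "detractor", "c3": "passive", "c4": "detractor"}

# Blend both sources into one view, then summarize revenue by sentiment.
blended = [
    {"customer": cid, "revenue": rev, "sentiment": sentiment.get(cid, "unknown")}
    for cid, rev in revenue.items()
]

revenue_by_sentiment = {}
for row in blended:
    revenue_by_sentiment.setdefault(row["sentiment"], 0)
    revenue_by_sentiment[row["sentiment"]] += row["revenue"]

print(revenue_by_sentiment)  # {'promoter': 12000, 'detractor': 6600, 'passive': 8800}
```

    A dashboard tool does the same join-and-aggregate at scale, across many sources, and renders the result visually rather than as a dictionary.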

    To Conclude…

    As we reach the end of this post on data interpretation and analysis, we hope you now have a clear understanding of the topic. We've covered the definition of data interpretation and presented examples and methods for carrying out a successful interpretation process.

    The importance of data interpretation is undeniable. Dashboards not only bridge the information gap between traditional data interpretation methods and technology, but they can help remedy and prevent the major pitfalls of interpretation. As a digital age solution, they combine the best of the past and the present to allow for informed decision-making with maximum data interpretation ROI.

    Author: Bernardita Calzon

    Source: Datapine

  • Data management: building the bridge between IT and business

    Data management: building the bridge between IT and business

    We all know businesses are trying to do more with their data, but inaccuracy and general data management issues are getting in the way. For most businesses, the status quo for managing data is not always working. However, new research shows that data is moving from a knee-jerk, “must be IT’s issue” conversation to a “how can the business better leverage this rich (and expensive) data resource we have at our fingertips” conversation.

    The emphasis is on “conversation”: business and IT need to communicate in the new age of artificial intelligence, machine learning, and interactive analytics. Roles and responsibilities are blurring, and a company’s data is expected to quickly turn from an IT infrastructure cost center into a revenue generator for the business. In order to address the issues of control and poor data quality, there needs to be an ever-stronger bridge between IT and the business. This bridge has two components. The first is technology that is sophisticated enough to handle complex data issues yet easy enough to provide quick time-to-value. The second is people who can bridge the gap between IT concerns (systems, storage, access) and business users’ need for value and results (enter data analysts and data engineers).

    This bridge needs to be built with three key components in mind:

    • Customer experience:

      For any B2C company, customer experience is the number one hot topic of the day and a primary way they are leveraging data. A 2019 data management benchmark report found that 98% of companies use data to improve customer experience. And for good reason: between social media, digital streaming services, online retailers, and others, companies are looking to show the consumer that they aren’t just a corporation, but that they are the corporation most worthy of building a relationship with. This invariably involves creating a single view of the customer (SVC), and that view needs to be built around context and based on the needs of the specific department within the business (accounts payable, marketing, customer service, etc.).
    • Trust in data:

      Possessing data and trusting data are two completely different things. Lots of companies have lots of data, but that doesn’t mean they automatically trust it enough to make business-critical decisions with it. Research finds that on average, organizations suspect 29% of current customer/prospect data is inaccurate in some way. In addition, 95% of organizations see impacts in their organization from poor quality data. A lack of trust in the data available to business users paralyzes decisions, and even worse, impacts the ability to make the right decisions based on faulty assumptions. How often have you received a report and questioned the results? More than you’d like to admit, probably. To get around this hurdle, organizations need to drive culture change around data quality strategies and methodologies. Only by completing a full assessment of data, developing a strategy to address the existing and ongoing issues, and implementing a methodology to execute on that strategy, will companies be able to turn the corner from data suspicion to data trust.
    • Changing data ownership:

      The responsibilities between IT and the business are blurring. 70% of businesses say that not having direct control over data impacts their ability to meet strategic objectives. The reality is that differing definitions of control are throwing people off. IT thinks of control in terms of storage, systems, and security. The business thinks of control in terms of access, actionability, and accuracy. The role of the CDO is helping to bridge this gap, bringing the nuts and bolts of IT in line with the visions and aspirations of the business.
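    The “full assessment of data” mentioned under trust in data often starts with simple rule-based quality checks. A minimal Python sketch (the records and validation rules are invented for illustration; real assessments run far richer rule sets against a CRM or warehouse):

```python
import re

# Illustrative customer records; in practice these come from a CRM or warehouse.
customers = [
    {"email": "ana@example.com", "phone": "555-0142"},
    {"email": "not-an-email", "phone": ""},
    {"email": "li@example.com", "phone": "555-0199"},
]

def quality_issues(record):
    """Return a list of data-quality problems found in one customer record."""
    found = []
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record["email"]):
        found.append("invalid_email")
    if not record["phone"]:
        found.append("missing_phone")
    return found

flagged = [c for c in customers if quality_issues(c)]
rate = len(flagged) / len(customers)
print(f"{rate:.0%} of records have at least one issue")  # 33% of records have at least one issue
```

    Tracking a metric like this over time is one way to turn a vague suspicion that “29% of our data is inaccurate” into a measured, improvable number.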

    The bottom line is that for most companies data is still a shifting sea of storage, software stacks, and stakeholders. The stakeholders are key, both from IT and the business, and in how the two can combine to provide the oxygen the business needs to survive: better customer experience, more personalization, and an ongoing trust in the data they administrate to make the best decisions to grow their companies and delight their customers.

    Author: Kevin McCarthy

    Source: Dataversity

  • Data storytelling: 5 best practices

    Data storytelling: 5 best practices

    Learn how to hone both your verbal and written communication skills to make your insights memorable and encourage decision-makers to revisit your research.

    You’ve spent months collecting data with your insights team or research vendors, and you’ve compiled your research into a presentation that you think is going to blow your audience away. But what happens after you’ve finished presenting? Do your stakeholders act on the insights you’ve shared, or do they move on to their next meeting and quickly forget your key takeaways and recommendations?

    If you want to avoid the latter, it’s important to consider how you can make the biggest possible impact while presenting and also encourage your stakeholders to revisit your research after the fact. And that requires you to hone both your verbal and written communication skills.

    In other words: practice your storytelling.

    Research shows that combining statistics with storytelling results in a retention rate of 65-70%. So, how do you take advantage of this fact when presenting and documenting your insights?

    Below are five best practices to help you present insights through stories – and encourage your stakeholders to revisit those stories as they make business decisions.

    Tailor the message to your audience

    To maximize the impact of your story, you have to consider who’s hearing it.

    When you’re presenting to someone in finance, try to cover how your findings can help the company save money. When you’re talking to Marketing or Sales, explain how the information can drive new leads and close more deals. When you’re talking to the product development team, explain how they can deliver a better solution.

    The more you can address your audience’s concerns in the language they use and the context they understand, the bigger the impact your story will have.

    Ask yourself:

    1. How much does my audience already know about the subject I’m covering?
    2. How much time do they have to listen to what I’m saying?
    3. What are their primary concerns?
    4. What type of language do they use to communicate?
    5. Are there preconceptions I need to address?

    If your insights are applicable to multiple groups across the organization, it’s worth thinking about how you can tweak the story for each audience. This could mean writing different sets of key takeaways and implications for different groups or altering the examples you use to better align with each audience’s interests.

    Follow the structure of a story

    While stories come in various shapes, sizes, tones, and genres, they all have a few things in common – one of those being a similar structure.

    Think about how a movie is typically divided into three acts. Those acts follow this general structure:

    1. Setup: We’re introduced to the protagonist, and they experience some kind of inciting incident (i.e., the introduction of conflict or tension) that propels the story forward.
    2. Confrontation: The protagonist works to achieve a goal but encounters obstacles along the way.
    3. Resolution: The protagonist reaches the height of their conflict with an antagonist and achieves some kind of outcome (whether it’s the protagonist’s desired outcome or not will depend on the type of story).

    Here’s a (fictional) example of an insights-driven story that follows this structure:

    1. The insights team for a beverage company shares a recorded interview with a real customer, who we’ll call Raquel. Raquel talks about how she loves getting together for backyard barbecues with friends. She says that she used to always drink beer at these barbecues but has recently decided to stop drinking.
    2. Raquel goes on to say that she doesn’t really like soda because she thinks it’s too sweet, but she will often pick one up at barbecues because she wants to have a drink in her hand.
    3. After playing this interview, the insights team presents findings from their latest study into young women’s non-alcoholic beverage preferences. They use Raquel’s story to emphasize trends they are seeing for canned beverages with lower sugar or sweetener contents.

    By framing your data and reports in this narrative structure, you’re more likely to keep your audience interested, make your findings memorable, and emphasize how your findings relate to real customers or consumers. This is a great way to get business decision-makers to invest in and act on your insights.

    Put your editor’s hat on

    When you have managed or been directly involved with a research project, it can be tempting to include every fascinating detail in your presentation. However, if you throw extraneous information into your data story, you’ll quickly lose your audience. It’s important to put yourself in the mindset of your audience and ruthlessly edit your story down to its essential ingredients.

    According to Cinny Little, Principal Analyst at Forrester Research, you should focus on answering the audience’s two primary questions: “What’s in it for me?” and “Why do I need to care?”

    You should also keep your editor’s hat on when documenting your key recommendations or takeaways for a report. Studies show that people can only hold about four items in their conscious mind, or working memory, at any one time. If you include more than three or four recommendations, your audience will have a harder time retaining the most important information.

    Find your hook

    When presenting, don’t think you can start slow and build up excitement – research suggests you only have about 30 to 60 seconds to capture your audience’s attention. After that, you’ve lost them.

    And getting them back won’t be easy.

    That’s why you need a hook – a way to start your story that’s so engaging and compelling your audience can’t help but listen.

    According to Matthew Luhn, a writer, story consultant, and speaker who has experience working with Pixar, The Simpsons, and more, a compelling hook is at least one of the following:

    • Unusual
    • Unexpected
    • Action-filled
    • Driven by conflict

    When sharing your research, you could hook your audience by leading with a finding that goes against prevailing assumptions, or a specific example of a customer struggling with a problem that your product could solve. Find a hook that evokes a strong emotion so that your story will stick with listeners and drive them to make decisions.

    Experiment with your story medium

    If you present your research to a room (or Zoom meeting) full of stakeholders once and then move on, you’re limiting the reach, lifespan, and value of that research. At a time when so many teams have become decentralized and remote work is common, it’s more important than ever to preserve your data stories and make them accessible to your stakeholders on demand.

    At the most basic level, this could mean uploading your presentation decks to an insights management platform so that your stakeholders and team members can look them up whenever they want. However, it’s also worth thinking about other mediums you can translate your stories into. For example, you might publish infographics, video clips from customer interviews, or animated data visualizations alongside your reports. Think about the supporting materials you can include to bring the story to life for anyone who wasn’t in the room for the initial presentation.


    By applying the best practices above, you can take the data and reports that others often find dry (no matter how much you disagree) and turn them into compelling, engaging, and persuasive stories.

    This process of developing and distributing insights stories will enable you and your team to have a more strategic impact on your company as a whole by demonstrating the potential outcomes of making decisions based on research.

    Author: Madeline Jacobson

    Source: Greenbook

  • DataOps and the path from raw to analytics-ready data

    DataOps and the path from raw to analytics-ready data

    For the first time in human history, we have access to the second-by-second creation of vast quantities of information from nearly every activity of human life. It’s a tectonic shift that’s transforming human society. And among the myriad impacts is an important one for every business: the shift in data users’ expectations. In the same way that the advent of smartphones triggered expectations of access and convenience, the explosion in data volume is now creating expectations of availability, speed, and readiness. The scalability of the internet of things (IoT), AI in the data center, and software-embedded machine learning are together generating an ever-growing demand in the enterprise for immediate, trusted, analytics-ready data from every source possible.

    It makes complete sense, since there’s a direct correlation between your business’s ability to deliver analytics-ready data and your potential to grow your business. But as every data manager knows, yesterday’s infrastructure wasn’t built to deliver on today’s demands. Traditional data pipelines using batch and extended cycles are not up to the task. Neither are the legacy processes and lack of coordination that grew out of the siloed way we’ve traditionally set up our organizations, where data scientists and analysts are separate from line-of-business teams.

    As a result, enterprises everywhere are suffering from a data bottleneck. You know there’s tremendous value in raw data, waiting to be tapped. And you understand that in today’s data-driven era, success and growth depend on your ability to leverage it for outcomes. But the integration challenges presented by multi-cloud architecture put you in a difficult position. How can you manage the vast influx of data into a streamlined, trusted, available state, in enough time to act? How can you go from raw to ready for all users, in every business area, to uncover insights when they’re most impactful? And perhaps most importantly, how can you make sure that your competitors don’t figure it all out first?

    The raw-to-ready data supply chain

    There’s good news for everyone struggling with this issue.

    First, the technology is finally here. Today’s data integration solutions have the power to collect and interpret multiple data sets; eliminate information silos; democratize data access; and provide a consistent view of governed, real-time data to every user across the business. At the same time, the industry trend of consolidating data management and analytics functions into streamlined, end-to-end platforms is making it possible for businesses to advance the speed and the accuracy of data delivery. And that, in turn, is advancing the speed and accuracy of insights that can lead to new revenue creation.

    And second, we’re seeing the emergence of DataOps, a powerful new discipline that brings together people, processes, and technologies to optimize data pipelines for meeting today’s considerable demands. Through a combination of agile development methodology, rapid responses to user feedback, and continuous data integration, DataOps makes the data supply chain faster, more efficient, more reliable, and more flexible. As a result, modern data and analytics initiatives become truly scalable, and businesses can take even greater advantage of the data revolution to pull ahead.

    What is DataOps for analytics?

    Like DevOps before it, which ignited a faster-leaner-more-agile revolution in app development, DataOps accelerates the entire ingestion-to-insight analytics value chain. Also like DevOps, DataOps is neither a product nor a platform; it’s a methodology that encompasses the adoption of modern technologies, the processes that bring the data from its raw to ready state, and the teams that work with and use data.

    By using real-time integration technologies like change data capture and streaming data pipelines, DataOps disrupts how data is made available across the enterprise. Instead of relying on the stutter of batch orientation, it moves data in a real-time flow for shorter cycles. Additionally, DataOps introduces new processes for streamlining the interaction among data owners, database administrators, data engineers, and data consumers. In fact, DataOps ignites a collaboration mentality (and a big cultural change) among every role that touches data, ultimately permeating the entire organization.
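    The change-data-capture idea can be sketched in a few lines of Python (the event shapes are simplified for illustration; real CDC tools emit richer log records). Instead of reloading a table in batch, each change event from the database log is applied to a continuously updated view:

```python
# Each change event mirrors what a CDC tool would emit from a database log.
events = [
    {"op": "insert", "key": "order-1", "value": {"status": "new"}},
    {"op": "update", "key": "order-1", "value": {"status": "shipped"}},
    {"op": "insert", "key": "order-2", "value": {"status": "new"}},
    {"op": "delete", "key": "order-1"},
]

# Apply the stream to a materialized view, one event at a time.
materialized = {}
for ev in events:
    if ev["op"] == "delete":
        materialized.pop(ev["key"], None)
    else:
        materialized[ev["key"]] = ev["value"]

print(materialized)  # {'order-2': {'status': 'new'}}
```

    The view is always as fresh as the last event processed, which is what lets DataOps replace the stutter of batch orientation with shorter, continuous cycles.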

    What does DataOps look like from a data-user perspective?

    In a subsequent post, I’ll delve more granularly into the technical and procedural components of DataOps for Analytics, looking at it from an operational perspective. For this post, where I want to highlight the business impact, I’ll start with a quick overview of what DataOps looks like from a data-user perspective.

    • All data, trusted, in one simplified view: Every data-user in the enterprise has 24/7 access to the data (and combinations of data) they need, in an intuitive and centralized marketplace experience. Analysts of every skill level can load, access, prepare, and analyze data in minutes without ever having to contact IT.
    • Ease of collaboration: It becomes faster and easier for data scientists and business analysts to connect, collaborate, and crowd-source key information. For example, identifying and surfacing the most popular and reliable data sets becomes possible.
    • Reliability and accuracy: Because the data is governed and continuously updated, with all users drawing from the same data catalogue, trust is high, teams are aligned, and insights are reliable.
    • Automation: Users are freed to ask deeper questions sooner, thanks to the automation of key repeatable requests. And with AI-enabled technologies that suggest the best visualization options for a given data set, chart creation is faster and easier, too. Other AI technologies point users toward potential new insights to explore, prompting them to reach relevant and previously undiscovered insights.
    • Ease of reuse: Data sets do not have to be generated again and again, for every application, but rather can be reused as needs arise and relevance expands – from planning and strategy to forecasting and identifying future opportunities in an existing client base.
    • Increased data literacy: DataOps fosters the easiest kind of data literacy boost by automating, streamlining, and simplifying data delivery. Regardless of existing skill levels, every member of your team will find it much more intuitive to work with data that’s readily available and trusted. At the same time, DataOps buttresses the more active efforts of skills training by delivering reliable data in real time. Getting the right data to the right people at the right time keeps even the most advanced analysts moving forward in new directions.

    What are the business outcomes?

    In every era, speed has given businesses a competitive advantage. In the data-driven era, where consumers expect real-time experiences and where business advantage can be measured in fractions of a second, speed has become more valuable than ever. One of the fundamental advantages of DataOps for Analytics is the speed of quality data delivery. The faster you can get data from raw to ready (ready for analysis, monetization, and productization), the faster you can reap all the benefits data promises to deliver.

    But speed is just the beginning. By delivering governed, reliable, analytics-ready data from a vast array of sources to every user in the enterprise, the raw-to-ready data supply chain becomes an elegant lever for business transformation and growth. Here are four key areas where DataOps galvanizes transformation:

    1. Customer intelligence: With an agile data supply chain, you can much more efficiently use analytics to improve customer experience and drive increased lifetime value. Discover deeper customer insights faster, and use them to customize interactions; increase conversion; and build long-term, one-to-one customer relationships by offering personalized experiences at scale.
    2. Reimagined processes: Accelerating, streamlining, and automating your data pipelines enables teams across your organization to more quickly and effectively optimize every aspect of business for efficiency and productivity. This includes automating processes, reducing costs, optimizing the overall supply chain, freeing up scarce resources, improving field operations, and boosting performance.
    3. Balanced risk and reward: Nimble data-delivery empowers analytics users to get timely insight into internal and external factors to make faster, smarter decisions around risk. Leaders can manage production; keep data current, consistent, and in the right hands; and stay compliant while preparing for the future.
    4. New business opportunities: And finally, a raw-to-ready data supply chain gives you the power to develop new products, services, and revenue streams with insights gleaned from data and/or to monetize the data itself. This may be the most exciting opportunity we’re seeing with DataOps for Analytics today; it’s certainly the most transformative. For example, consider how storied American conglomerate GE has transformed a century-old business model (selling hardware) to create a digital platform for commodifying their data. And think about how tech behemoths like Amazon and Google have used their massive stores of data and agile analytics capabilities to attack and disrupt traditional markets like insurance, banking and retail.

    The heart of digital transformation

    If you’re a CIO or CDO launching, or already underway with, strategic digital transformation programs for competitive viability, data is the key. To thrive, your initiatives need an agile, integrated data and analytics ecosystem that provides a raw-to-ready data supply chain, accelerates time-to-insight, and enables a rapid test-and-learn cycle. That’s DataOps for Analytics, and it’s the dawn of a new era in the evolution of the data-driven organization.

    Author: Mike Capone

    Source: Qlik

  • A look at the data transformation of the world's largest beer brewer

    A look at the data transformation of the world's largest beer brewer

    The world's largest beer brewer, AB InBev, had a major problem with its data: it was everywhere, but nobody knew how to use it. That has changed, following an important transformation of the organization.

    Anheuser-Busch InBev, the brewer of Budweiser, Corona, and more than 500 other beer brands, had a data problem when Harinder Singh joined the company in 2017 as global director of data strategy and solution architecture.

    Thanks in part to the acquisition of more than a dozen beer brands in recent years, AB InBev had accumulated an abundance of data across more than 100 countries, stored in on-premises and cloud systems. Singh's objective? To consolidate and unify that data and make it available to business users through a single 'lens'.

    'My colleagues here tell me that three years ago, thinking about technology or data wasn't exactly top of mind', says Singh, who held a similar role at Walmart eCommerce before joining AB InBev. 'Business transformation has to be enabled by digital transformation, and data is at its core'.

    Data on tap

    Treating 'business data as the new oil', companies are willing to pay top dollar for software that can clean and organize data to extract business insights. As a result, worldwide revenues for big data and business analytics software are expected to reach some 260 billion dollars by 2022, according to market research from IDC.

    But if data is the new oil, integrating it is the equivalent of extracting it from the ground, putting it in a digital barrel, and making it ready for consumption. The problem? Data is becoming increasingly fragmented across organizations, especially as legacy applications are replaced by new, loosely coupled applications.

    AB InBev's data was stored in more than 100 source systems, 15 SAP systems, and 27 ERP systems. The company also relied on 23 separate ETL (extract, transform, load) tools to move data from one database to another.

    This approach made it difficult to get a unified view of the data, says Singh. And now that the GDPR (General Data Protection Regulation) is in force, AB InBev needs global visibility into its data, some of which concerns consumers and is therefore subject to multiple privacy laws.

    'We still need to standardize and integrate that data, another aspect of our data challenge', says Singh.

    Singh ran a month-long proof of concept with Talend, a vendor of cloud data integration software, before selecting the vendor. AB InBev uses Talend, a cloud-based ETL tool for the modern era, to extract data from various sources, including cloud and on-premises systems and IoT devices, and store it in a Hortonworks Hadoop data lake hosted on Microsoft Azure.

    That data is then processed and archived before Talend shuttles it to a 'golden layer', which data scientists, operational staff, and business users can access through data visualization tools. AB InBev's reusable data management architecture also includes open source tools such as Hive, Spark, HBase, and Kafka, says Singh.
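    The raw-to-golden flow described above can be sketched in plain Python (the field names and cleansing rules are invented for illustration; the real pipeline runs on Talend, Hadoop, and Spark at a vastly larger scale):

```python
# Raw layer: records land as-is from source systems (illustrative sample).
raw = [
    {"sku": " BUD-01 ", "units": "120", "country": "us"},
    {"sku": "COR-07", "units": "n/a", "country": "MX"},
    {"sku": "bud-01", "units": "80", "country": "US"},
]

def standardize(record):
    """Processed layer: normalize keys and types; drop unparseable records."""
    try:
        units = int(record["units"])
    except ValueError:
        return None
    return {"sku": record["sku"].strip().upper(), "units": units,
            "country": record["country"].upper()}

processed = [r for r in (standardize(x) for x in raw) if r is not None]

# Golden layer: one aggregated, analysis-ready view per SKU.
golden = {}
for r in processed:
    golden[r["sku"]] = golden.get(r["sku"], 0) + r["units"]

print(golden)  # {'BUD-01': 200}
```

    The point of the layered design is that consumers of the golden layer never have to care that 'BUD-01' arrived in three inconsistent formats from three source systems.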

    The analytics of beer

    Where AB InBev employees once spent 70% to 80% of their time locating relevant data across different systems, they now retrieve information for analytics from a single source. The work is ongoing, but Singh is confident that the data platform positions AB InBev's workforce to gain more critical insights into sales, supply chain management, marketing, human resources, and other lines of business.

    For example, AB InBev combines consumer data from Nielsen and market surveys with near-real-time social media data to analyze trends and deliver the right beers and more targeted marketing campaigns, including real-time coupons tailored to the consumer at the point of purchase. AB InBev can also identify the best in-store locations to sell its beers, and determine how to create real-time events to drive more conversion.

    The company also uses analytics to optimize its supply chain. IoT (Internet of Things) data from RFID (radio-frequency identification) devices helps track so-called 'connected packages' to find the best routes for delivery drivers, and helps regulate the temperature in millions of beer coolers around the world to ensure that AB InBev's products are stored and served at the optimal temperature.

    Singh concedes that AB InBev is still working through technical backlogs involving outdated processes and technologies, but he attributes AB InBev's successful data transformation to three factors: a cloud-first approach, the use of data for business insight, and the development of reusable processes for quickly extracting data and making it accessible.

    'Ultimately, it's about bringing the consumer the best beer and product', says Singh. 'The use of data is critical to our success in that pursuit'.

    Source: CIO

  • Determining the feature set complexity

    Determining the feature set complexity

    Thoughtful predictor selection is essential for model fairness

    One common AI-related fear I’ve often heard is that machine learning models will leverage oddball facts buried in vast databases of personal information to make decisions impacting lives. For example, the fact that you used Arial font in your resume, plus your cat ownership and fondness for pierogi, will prevent you from getting a job. Associated with such concerns is fear of discrimination based on sex or race due to this kind of inference. Are such fears silly or realistic? Machine learning models are based on correlation, and any feature associated with an outcome can be used as a decision basis; there is reason for concern. However, the risks of such a scenario occurring depend on the information available to the model and on the specific algorithm used. Here, I will use sample data to illustrate differences in incorporation of incidental information in random forest vs. XGBoost models, and discuss the importance of considering missing information, appropriateness and causality in assessing model fairness.

    Feature choice — examining what might be missing as well as what’s included– is very important for model fairness. Often feature inclusion is thought of only in terms of keeping or omitting “sensitive” features such as race or sex, or obvious proxies for these. However, a model may leverage any feature associated with the outcome, and common measures of model performance and fairness will be essentially unaffected. Incidental correlated features may not be appropriate decision bases, or they may represent unfairness risks. Incidental feature risks are highest when appropriate predictors are not included in the model. Therefore, careful consideration of what might be missing is crucial.


    This article builds on results from a previous blog post and uses the same dataset and code base to illustrate the effects of missing and incidental features [1, 2]. In brief, I use a publicly-available loans dataset, in which the outcome is loan default status (binary), and predictors include income, employment length, debt load, etc. I preferentially (but randomly) sort lower-income cases into a made-up “female” category, and for simplicity consider only two gender categories (“males” and “females”). The result is that “females” on average have a lower income, but male and female incomes overlap; some females are high-income, and some males low-income. Examining common fairness and performance metrics, I found similar results whether the model relied on income or on gender to predict defaults, illustrating risks of relying only on metrics to detect bias.

    My previous blog post showed what happens when an incidental feature substitutes for an appropriate feature. Here, I will discuss what happens when both the appropriate predictor and the incidental feature are included in the data. I test two model types and show that, as might be expected, female status contributes to predictions despite the fact that it contains no additional information. However, the incidental feature contributes much more to the random forest model than to the XGBoost model, suggesting that model selection may help reduce unfairness risk, although tradeoffs should be considered.

    Fairness metrics and global importances

    In my example, the female feature adds no information to a model that already contains income. Any reliance on female status is unnecessary and represents “direct discrimination” risk. Ideally, a machine learning algorithm would ignore such a feature in favor of the stronger predictor.

    When the incidental feature, female status, is added to either a random forest or XGBoost model, I see little change in overall performance characteristics or metrics (data not shown). ROC scores barely budge (as should be expected). False positive rates show very slight changes.

    Demographic parity, or the difference in loan default rates for females vs. males, remains essentially unchanged for XGBoost (5.2% vs. 5.3%) when the female indicator is included; for random forest, however, this metric does change, from 4.3% to 5.0%. I discuss this observation in detail below.
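
    A demographic parity difference like this is simple to compute from model outputs. A minimal sketch, using made-up toy predictions and group labels rather than the article’s loans data:

```python
def demographic_parity_difference(y_pred, group):
    """Difference in positive (default) prediction rates between groups.

    y_pred: iterable of 0/1 predictions
    group:  iterable of group labels, here "F" and "M"
    """
    rates = {}
    for g in set(group):
        preds = [p for p, gg in zip(y_pred, group) if gg == g]
        rates[g] = sum(preds) / len(preds)
    return rates["F"] - rates["M"]

# toy example: the model flags half the females but a quarter of the males
preds = [1, 0, 1, 0, 0, 1, 0, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
print(demographic_parity_difference(preds, groups))  # 0.25
```

    With more than two groups, returning the per-group `rates` dictionary directly may be more useful than a single pairwise difference.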

    Global permutation importances show weak influences from the female feature for both model types. This feature ranks 12/14 for the random forest model, and 22/26 for XGBoost (when female=1). The fact that female status is of relatively low importance may seem reassuring, but any influence from this feature is a fairness risk.

    There are no clear red flags in global metrics when female status is included in the data — but this is expected as fairness metrics are similar whether decisions are based on an incidental or causal factor [1]. The key question is: does incorporation of female status increase disparities in outcome?

    Aggregated shapley values

    We can measure the degree to which a feature contributes to differences in group predictions using aggregated Shapley values [3]. This technique distributes differences in predicted outcome rates across features so that we can determine what drives differences for females vs. males. Calculation involves constructing a reference dataset consisting of randomly selected males, calculating Shapley feature importances for randomly-selected females using this “foil”, and then aggregating the female Shapley values (also called “phi” values).
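
    The aggregation step itself is just an average of per-case Shapley values. A sketch with hypothetical phi values (in practice these would come from an explainer such as the shap package, computed for sampled females against the male reference group):

```python
import numpy as np

# Hypothetical per-case Shapley (phi) values for four female cases,
# computed against a male reference "foil"; columns are features.
feature_names = ["income", "debt_load", "female", "emp_length"]
phi = np.array([
    [0.020, 0.005, 0.015, -0.002],
    [0.030, 0.001, 0.018,  0.000],
    [0.025, 0.004, 0.014, -0.001],
    [0.021, 0.002, 0.017,  0.001],
])

# Aggregating = averaging phi per feature; the result distributes the
# female-vs-male gap in predicted default probability across features.
aggregated = phi.mean(axis=0)
for name, contrib in zip(feature_names, aggregated):
    print(f"{name:>10}: {contrib:+.4f}")

# The per-feature contributions sum to the total probability gap.
print("total gap:", round(aggregated.sum(), 4))  # 0.0425
```
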

    Results are shown below for both model types, with and without the “female” feature. The top 5 features for the model not including female are plotted, along with female status for the model that includes that feature. All other features are summed into “other”.


    Image by author

    First, note that the blue bar for female (present for the model including female status only) is much larger for random forest than for XGBoost. The bar magnitudes indicate the amount of probability difference for women vs. men that is attributed to a feature. For random forest, the female status feature increases the probability of default for females relative to males by 1.6%, compared to 0.3% for XGBoost, an ~5x difference.

    For random forest, female status ranks in the top 3 influential features in determining the difference in prediction for males vs. females, even though the feature was the 12th most important globally. The global importance does not capture this feature’s impact on fairness.

    As mentioned in the section above, the random forest model shows decreased demographic parity when female status is included in the model. This effect is also apparent in the Shapley plots: the increase due to the female bar is not compensated for by any decrease in the other bars. For XGBoost, the small contribution from female status appears to be offset by tiny decreases in contributions from other features.

    The reduced impact of the incidental feature for XGBoost compared to random forest makes sense when we think about how the algorithms work. Random forests create trees using random subsets of features, which are examined for optimal splits. Some of these initial feature sets will include the incidental feature but not the appropriate predictor, in which case incidental features may be chosen for splits. For XGBoost models, split criteria are based on improvements to a previous model. An incidental feature can’t improve a model based on a stronger predictor; therefore, after several rounds, we expect trees to include the appropriate predictor only.
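
    This mechanism is easy to demonstrate on synthetic data. The sketch below is not the article’s loans data, and scikit-learn’s GradientBoostingClassifier stands in for XGBoost: a noisy proxy feature duplicates the information in a strong predictor, and we compare how much each model type leans on it.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000
income = rng.normal(size=n)
# "female" is a noisy proxy for low income and adds no new information
female = (income + rng.normal(scale=0.5, size=n) < 0).astype(float)
default = (income + rng.normal(scale=0.3, size=n) < -0.5).astype(int)
X = np.column_stack([income, female])

# max_features=1: each split considers one randomly chosen feature,
# so some splits can only see the proxy
rf = RandomForestClassifier(n_estimators=200, max_features=1,
                            random_state=0).fit(X, default)
gb = GradientBoostingClassifier(n_estimators=200, random_state=0).fit(X, default)

print("proxy importance, random forest:    ", rf.feature_importances_[1])
print("proxy importance, gradient boosting:", gb.feature_importances_[1])
```

    Because the boosted model greedily improves on previous rounds, it has little use for a feature that merely echoes a stronger predictor, while the random forest keeps splitting on the proxy whenever it is the only candidate.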

    The decrease in demographic parity for random forest can also be understood by considering model-building mechanisms. When a subset of features to be considered for a split is generated in the random forest, we essentially have two “income” features, so it’s more likely that (direct or indirect) income information will be selected.

    The random forest model effectively uses a larger feature set than XGBoost. Although numerous features are likely to appear in both model types to some degree, XGBoost solutions will be weighted towards a smaller set of more predictive features. This reduces, but does not eliminate, risks related to incidental features for XGBoost.

    Is XGBoost fairer than Random Forest?

    In a previous blog post [4], I showed that incorporation of interactions to mitigate feature bias was more effective for XGBoost than for random forest (for one test scenario). Here, I observe that the XGBoost model is also less influenced by incidental information. Does this mean that we should prefer XGBoost for fairness reasons?

    XGBoost has advantages when both an incidental and appropriate feature are included in the data but doesn’t reduce risk when only the incidental feature is included. A random forest model’s reliance on a larger set of features may be a benefit, especially when additional features are correlated with the missing predictor.

    Furthermore, the fact that XGBoost doesn’t rely much on the incidental feature does not mean that it doesn’t contribute at all. It may be that only a smaller number of decisions are based on inappropriate information.

    Leaving fairness aside, the fact that the random forest samples a larger portion of what you might think of as the “solution space”, and relies on more predictors, may have some advantages for model robustness. When a model is deployed and faces unexpected errors in data, the random forest model may be somewhat better able to compensate. (On the other hand, if a random forest incorporates a correlated feature that is affected by errors, it might be compromised while an XGBoost model remains unaffected.)

    XGBoost may have some fairness advantages, but the “fairest” model type is context-dependent, and robustness and accuracy must also be considered. I feel that fairness testing and explainability, as well as thoughtful feature choices, are probably more valuable than model type in promoting fairness.

    What am I missing?

    Fairness considerations are crucial in feature selection for models that might affect lives. There are numerous existing feature selection methods, which generally optimize accuracy or predictive power, but do not consider fairness. One question that these don’t address is “what feature am I missing?”

    A model that relies on an incidental feature that happens to be correlated with a strong predictor may appear to behave in a reasonable manner, despite making unfair decisions [1]. Therefore, it’s very important to ask yourself, “what’s missing?” when building a model. The answer to this question may involve subject matter expertise or additional research. Missing predictors thought to have causal effects may be especially important to consider [5, 6].

    Obviously, the best solution for a missing predictor is to incorporate it. Sometimes, this may be impossible. Some effects can’t be measured or are unobtainable. But you and I both know that simple unavailability seldom determines the final feature set. Instead, it’s often, “that information is in a different database and I don’t know how to access it”, or “that source is owned by a different group and they are tough to work with”, or “we could get it, but there’s a license fee”. Feature choice generally reflects time and effort — which is often fine. Expediency is great when it’s possible. But when fairness is compromised by convenience, something does need to give. This is when fairness testing, aggregated Shapley plots, and subject matter expertise may be needed to make the case to do extra work or delay timelines in order to ensure appropriate decisions.

    What am I including?

    Another key question is “what am I including?”, which can often be restated as “for what could this be a proxy?” This question can be superficially applied to every feature in the dataset but should be very carefully considered for features identified as contributing to group differences; such features can be identified using aggregated Shapley plots or individual explanations. It may be useful to investigate whether such features contribute additional information above what’s available from other predictors.

    Who am I like, and what have they done before?

    A binary classification model predicting something like loan defaults, likelihood to purchase a product, or success at a job, is essentially asking the question, “Who am I like, and what have they done before?” The word “like” here means similar values of the features included in the data, weighted according to their predictive contribution to the model. We then model (or approximate) what this cohort has done in the past to generate a probability score, which we believe is indicative of future results for people in that group.

    The “who am I like?” question gets to the heart of worries that people will be judged if they eat too many pierogis, own too many cats, or just happen to be a certain race, sex, or ethnicity. The concern is that it is just not fair to evaluate individual people due to their membership in such groups, regardless of the average outcome for overall populations. What is appropriate depends heavily on context — perhaps pierogis are fine to consider in a heart attack model, but would be worrisome in a criminal justice setting.

    Our models assign people to groups — even if models are continuous, we can think of that as the limit of very small buckets — and then we estimate risks for these populations. This isn’t much different than old-school actuarial tables, except that we may be using a very large feature set to determine group boundaries, and we may not be fully aware of the meaning of the information we use in the process.

    Final thoughts

    Feature choice is more than a mathematical exercise, and likely requires the judgment of subject matter experts, compliance analysts, or even the public. A data scientist’s contribution to this process should involve applying explainability techniques to populations to discover the features driving group differences. We can also identify at-risk populations and ask questions about features known to have causal relationships with outcomes.

    Legal and compliance departments often focus on included features, and their concerns may be primarily related to specific types of sensitive information. Considering what’s missing from a model is not very common. However, the question, “what’s missing?” is at least as important as, “what’s there?” in confirming that models make fair and appropriate decisions.

    Data scientists can be scrappy and adept at producing models with limited or noisy data. There is something satisfying about getting a model that “works” from less than ideal information. It can be hard to admit that something can’t be done, but sometimes fairness dictates that what we have right now really isn’t enough — or isn’t enough yet.

    Author: Valerie Carey

    Source: Towards Data Science

  • Distinguishing between advanced analytics and business intelligence

    Distinguishing between advanced analytics and business intelligence

    Advanced analytics and business intelligence (BI) have more or less the same objective: use data to drive insights that inform business strategy. So what’s the difference? 

    What is business intelligence? 

    Business intelligence is an umbrella term for software and services that provide comprehensive yet straightforward insights about an organization’s current state. Think routine reporting or dashboarding, where data is clearly legible for stakeholders to understand month by month. Examples of business intelligence use cases abound, some of which include unifying data to better track marketing leads or to manage shipping operations across a fleet of trucks. Business intelligence is by no means easy, but it is grounded in practical, everyday uses of data. 

    What is advanced analytics? 

    Advanced analytics employs sophisticated tools and techniques that surpass traditional business intelligence capabilities. Like business intelligence, it is a wide-reaching term that involves many methods and lends itself to many possible use cases.

    Advanced analytics is not meant to replace business intelligence but to augment its efforts. It strives to ask deeper questions of the data, generating insights that not only indicate how the business is currently performing but where its future is headed. If we consider that business intelligence largely aims to point out strengths and weaknesses in current business processes, advanced analytics has the potential to make recommendations and predictions as to how to steer the organization forward. 

    Examples of 5 advanced analytics techniques 

    Let’s take a closer look at some of the techniques that fall under the category of advanced analytics. Rarely will organizations need to use all of these techniques at once as a part of their advanced analytics integration; rather, they are merely some of the many tools in the toolkit of a data professional. 

    1. Forecasting

    Forecasting is the technique of analyzing historical data to predict future outcomes. It considers prior trends to recommend how organizations should plan ahead, such as stocking more inventory for a historically popular sales day. Forecasts can be extremely accurate, but their reliability depends upon the relevance and availability of historical data, as well as the time period to be forecasted.
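
    As a toy illustration of trend-based forecasting (not a production method; real forecasting tools also model seasonality and uncertainty), a least-squares trend line can be extrapolated forward:

```python
def linear_trend_forecast(history, steps):
    """Extrapolate a least-squares linear trend `steps` periods ahead."""
    n = len(history)
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(history)) \
            / sum((x - x_mean) ** 2 for x in range(n))
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + i) for i in range(steps)]

# monthly units sold, trending upward
sales = [100, 110, 121, 128, 140, 152]
print(linear_trend_forecast(sales, 3))  # roughly [160.9, 171.1, 181.3]
```

    As the article notes, reliability depends on the relevance of the history: the further out the forecast horizon, the wider the error around such a simple extrapolation.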

    2. Machine learning

    Machine learning is the process of training a computer to predict outcomes without it being specifically programmed to do so. Machine learning models are built to model the desired behavior, and as the model is fed more and more training data, its accuracy in predicting outcomes increases. Data, and lots of it, is the key to effective machine learning models.
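
    The effect of training-set size is easy to see in a few lines of scikit-learn. A sketch on synthetic data, with made-up sample sizes:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# synthetic binary classification problem
X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# held-out accuracy generally improves as the training set grows
for n in (50, 500, 4000):
    model = LogisticRegression(max_iter=1000).fit(X_train[:n], y_train[:n])
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"trained on {n:>4} samples: test accuracy {acc:.3f}")
```
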

    3. Data mining and pattern matching

    Data mining is the process of uncovering patterns in large batches of raw data for further analysis. Analysts often don’t know what’s in data warehouses or what they should be looking for; data mining techniques, such as pattern matching, help source the right data from data warehouses based upon connections in the data.

    4. Semantic analysis

    Semantic analysis is the act of determining meaning from text data. By way of semantic analysis, computers can “read” full documents by analyzing their grammatical structure and the relationships between individual words. The technique is particularly useful for marketing teams analyzing social media data, or for customer service teams seeking to better understand the effectiveness of online customer support.

    5. Complex event processing

    Complex event processing is the act of aggregating huge volumes of data to help determine cause-and-effect relationships for any given event. By matching incoming events against a pattern, complex event processing can shed light on what is happening.
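
    At its core this is sequence matching over an event stream. A toy sketch with a hypothetical clickstream (real CEP engines such as Esper or Flink CEP add time windows, partitioning keys, and richer operators):

```python
from collections import deque

def detect_pattern(events, pattern, window):
    """Report indices where `pattern` (an ordered sequence of event types)
    occurs as a subsequence within the last `window` events."""
    recent = deque(maxlen=window)
    matches = []
    for i, event in enumerate(events):
        recent.append(event)
        it = iter(recent)
        # `p in it` consumes the iterator, enforcing order
        if all(p in it for p in pattern):
            matches.append(i)
    return matches

stream = ["login", "browse", "add_to_cart", "browse",
          "checkout_fail", "checkout_fail"]
# two failed checkouts shortly after adding to cart may signal a payment issue
print(detect_pattern(stream,
                     ["add_to_cart", "checkout_fail", "checkout_fail"],
                     window=5))  # [5]
```
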

    Benefits of advanced analytics

    It’s widely recognized that an advanced analytics integration offers a competitive edge. Just a few of the benefits that advanced analytics can deliver include: 

    • Better decision-making
      Advanced analytics delivers valuable insights that allow organizations to make better decisions, adjust their company strategy, and plan for the future. 
    • Saved costs
      Identifying overspend or leaking costs through advanced analytics can have a huge impact on the budget over time.
    • Increased innovation
      Through advanced analytics, organizations have developed innovative new products, processes, or sales/marketing strategies that have given them a leg up from the competition.

    Challenges of advanced analytics

    Many organizations encounter roadblocks along their advanced analytics journey, which prevent them from fully realizing these benefits. According to a 2018 McKinsey survey, “fewer than 20 percent [of companies] have maximized the potential and achieved advanced analytics at scale.” Some of the top challenges of advanced analytics include:

    • Cost
      Advanced data analytics will prove its ROI over time, but the upfront costs can be substantial. Investing in infrastructure and talent, as well as the time required for data strategy and deployment, can be intimidating for organizations to take on.
    • Working with data from multiple sources
      Effective analytics should employ as many data sources as necessary, but gathering and integrating all of these data sources can be challenging.
    • Inaccessible data
      Even after the appropriate amount of data is gathered and centralized, if that data isn’t made accessible to the analysts that need to use it, it will serve little value to the organization.
    • Skills shortage
      Data scientists and data engineers are costly resources and difficult to source. Though user-friendly technologies have lowered the barrier to advanced analytics, many organizations still want a foundational data science team.
    • Poor quality data
      Harvard Business Review called poor quality data “enemy number one” to machine learning initiatives, and that extends to all facets of advanced analytics. If data hasn’t been vetted to meet data quality standards or properly prepared for the requirements of the analysis at hand, it will only lead to faulty or misleading insights. 

    Data preparation & advanced analytics

    Data preparation accounts for up to 80% of total analytic time. It’s where analysts can encounter a minefield of analytic challenges. But it also presents the biggest opportunity for improvement. Succeed at data preparation and, odds are, you’ll face far fewer advanced analytics challenges. 

    Traditional data preparation methods like extract, transform, and load (ETL) tools or hand-coding are time-consuming and bar analysts from the process of transforming their own data. Recently, organizations have invested in modern data preparation platforms as part of their advanced analytics integration, which allow organizations to:

    • Easily connect to a diverse range of data sources. 
    • Identify data quality issues through a visual interface. 
    • Involve non-technical analysts in the process of preparing data. 
    • Integrate structured and unstructured data of any size. 
    • Reduce the total time spent preparing data by up to 90%. 

    Author: Matt Derda

    Source: Trifacta

  • Doing market research using data? Don't forget to engage with your customers

    Doing market research using data? Don't forget to engage with your customers

    The ability to gather and process intimate, granular detail on a mass scale promises to uncover unimaginable relationships within a market. But does “detail” actually equate to “insight”?

    Many decision makers clearly believe it does. In Australia, for instance, the big four banks Westpac, National, ANZ and Commonwealth are spending large on churning through mountains of customer data that relate one set of variables — gender, age, and occupation, for instance — to a range of banking products and services. Australia’s largest bank, the Commonwealth, has announced its big data push.

    Like the big banks, Australia’s two largest supermarket chains, Woolworths and Coles, are scouring customer data, applying the massive computing power now available (and needed) together with statistical techniques in the search for “insights.” This could involve combining web browsing activity with social media use, with purchasing patterns, and so on — complex analysis across diverse platforms.

    While applying correlation and regression analysis (among other tools) to truckloads of data has its place, I have a real concern that — once again — CEOs and senior executives will retreat to their suites satisfied that the IT department will now do all the heavy lifting when it comes to listening to the customer.

    Data’s Deceptive Appeal

    To peek into the deceptive appeal of numbers, let’s review how one business hid behind its data for years.

    Keith is the CEO of a wealth management business focused on high-net-worth individuals. It assists them with their investments by providing products, portfolio solutions, financial planning advice and real estate opportunities.

    Like its competitors Keith’s company employed surveys to gather data on how the business was performing. But Keith and his executive team came to realize that dredging through these details was not producing insights that management might use in strategy development.

    So, Keith’s team decided on a different path. One that really did involve listening to the customer. They conducted a series of client interviews structured in a way that allowed the customer to do the talking and the company to do the listening. What Keith and his executives discovered really shocked them.

    The first was that their data was based on nonsense. This came about because the questions they’d been asking were built on managers’ perceptions of what clients needed, not on what clients wanted to express. The result was data that didn’t reflect clients’ real requirements. The list of priorities obtained via client interviews coincided with management’s assumed client priorities a mere 50 percent of the time.

    Keith’s business is not alone in this as studies have shown that big data is often “precisely inaccurate.” A study reported by Deloitte found that “more than two-thirds of survey respondents stated that the third-party data about them was only 0 to 50 percent correct as a whole. One-third of respondents perceived the information to be 0 to 25 percent correct.”

    In Keith’s case this error was compounded when it came to the rating of these requirements. For example, the company believed that older clients wouldn’t rank “technology” (digital and online tools) as high on their list of requirements. However, in the interviews they discovered that while these older clients weren’t big users of technology themselves, many cared about it a great deal. This was because they had assistants who did use it and because they considered having state-of-the-art technology a prerequisite for an up-to-date business.

    What Keith and his team also discovered, to their surprise, was how few interviews it took to gain genuine insight. Keith reports that “we needed around 18 to 20 clients to uncover most of the substantive feedback. We thought we’d need many more.” What Keith has encountered here is saturation: a research term referring to the point at which you can stop conducting interviews because you fail to hear anything new.

    Listening to the Customer

    Engaging with your customers may not be as exciting and new as investing in “big data.” But it does have a solid track record of success. Cast your minds back to a historic time in Toyota’s history.

    When Toyota wanted to develop a luxury car for the United States, its team didn’t hunker down in Tokyo to come up with the perfect design. Nor did it sift through data obtained from existing Toyota customers about current Toyota models. Instead, it sent its designers and managers to California to observe and interview the target customer — an American, male, high-income executive — to find out what he wanted in a car. This knowledge, combined with its undoubted engineering excellence, resulted in a completely new direction for Toyota: a luxury export to the United States. You will know it better as the Lexus. Listening to the customer is now embedded in Toyota’s culture.

    Listening to the customer is also a fundamental component of Adobe’s culture. The company speaks of a “culture of customer listening” and has produced a useful set of guidelines on how to tune in to customers. Elaine Chao, a Product Manager with the company, has expressed it this way: “Listening is the first step. We try to focus on what customers want to accomplish, not necessarily how they want to accomplish it.”

    So, provided your data isn’t “precisely inaccurate,” employ modern computer power to examine patterns in your customers’ buying behavior. But understand big data’s limitations. The data is historic and static. It’s historic because it’s about the past: your customers have most likely moved on from what the data captures. And it’s static because, as with any computer modeling, it can never answer a question that you didn’t think to ask.

    Real insights come from seeing the world through someone else’s eyes. You will only ever get that by truly engaging with customers and listening to their stories.

    Author: Graham Kenny

    Source: Harvard Business Review

  • Down to Business: Seven tips for better market intelligence

    Making decisions about product and service offerings can make or break your success as a business. Business owners, executives and product managers need good information and data to make the most informed product decisions.

    This critical information about markets, customers, competitors and technology is called market intelligence. Market intelligence combined with analysis provides market insight and allows better decision making.

    Here are seven tips for better market intelligence:

    1. Develop a process: Your ability to harness, manage and analyze good data is vital to your success. Assure you develop a process for gathering, storing and utilizing market intelligence. Take the time to train your team and invest in a robust market intelligence process. It's an investment with an excellent return.

    2. Gather data when you lose: Often when a company loses an order we ask the salesperson what happened and they offer an opinion. It's important to drill down and really understand why you lost an important order. I recall a situation years ago where a salesperson's opinion was very different from what ultimately was the actual reason we lost this large order. Understanding the real reason for the loss assures you are far more likely to choose correct strategies to win the order in the future. Trust, but verify.

    3. Attend trade shows: You should attend trade shows and use them as a fact-finding mission. Trade shows are like one-stop shopping for market intelligence. There are industry analysts, suppliers, customers and industry media all in one location. Use your time wisely to engage with as many people as possible and utilize your listening skills. It's always best to plan ahead for trade shows, to make the best use of your limited time there. Make sure you stay at the hotel suggested by the show organizers. The "show hotel" may cost a little more than other hotels in the area, but you will have far more opportunities to gather information. You can also consider hiring someone, who does not work for your company, to gather information at trade shows, or speak with an industry analyst. This "stealth mode" of gathering market intelligence can provide added benefits.

    4. Take a customer to lunch: Understanding your customers, their challenges and their perception is one of the best ways to gain market insight. Ultimately it is your customer's perceptions that determine your brand positioning. Spending time with your customers, listening to them and acting on these insights, can provide you with an amazing competitive advantage.

    5. Build a database: Data can be hard to find as time moves forward and people leave an organization. It's worthwhile to build a central database of your market intelligence. By indexing this data it becomes easy for your product managers and executives to have access to the best information when making decisions.

    6. Assure you have good data: It takes good, accurate data for the best results; never forget this point. Good data means better decisions. Accuracy can be improved by using multiple sources and considering how any specific source may be biased. Bad information leads to poor decisions. Ensure you are gathering good data.

    7. Train your team: You cannot gather good data that provides market intelligence unless you have a team of professionals that understands how to gain the best market insights. Assure you have a team that is trained not only on how to gather market intelligence, but how to analyze and use the data for better decision making. As an example we offer a product management boot camp that covers this subject in detail, among others.

    Developing market intelligence takes work as well as a robust methodology. It's not a one-time event, but a continuous process. The absence of good data leads to suboptimal decisions. Good data leads to better decision-making and success for your organization.

  • Drawing value from data in manufacturing companies

    Drawing value from data in manufacturing companies

    The modern manufacturing world is a delicate dance, filled with interconnected pieces that all need to work perfectly in order to produce the goods that keep the world running. In this article, we explore the unique data and analytics challenges manufacturing companies face every day.

    The world of data in modern manufacturing

    With the datasphere growing exponentially to an expected volume of 175 zettabytes by 2025, it stands to reason that manufacturing is experiencing the radical impact of this growth, just as much as other business areas. Manufacturing companies that adopted computerization years ago are already taking the next step as they transform into smart data-driven organizations.

    It’s easy to see why. Manufacturing constantly seeks ways to increase efficiency, reduce costs, and unlock productivity and profitability. Data is a critical tool for identifying where and how that can be done in any manufacturing process. Whatever your department, whether you’re concerned with production, inventory, warehousing, or transportation and other logistics, knowing precisely how your operation is running and where it can be improved is essential to improving your bottom line.

    From a practical perspective, the computerization and automation of manufacturing hugely increase the data that companies acquire. And cloud data warehouses and data lakes give companies the capability to store these vast quantities of data. However, it’s only useful if you can accurately analyze it and get the insights you need to enhance your business.

    Modern factories are full of machines, sensors, and devices that make up the Internet of Things. All of them generate a trail of performance-tracking data. The challenge for manufacturers is to capture all this data in real-time and use it effectively. To achieve this, they need a BI and analytics platform that can transform the data into actionable insights for their business. And a substantial proportion of this data can be gathered and organized by analytics embedded at the network’s edge, within the manufacturing equipment itself. As a result, you can get insights faster, at the very place that they’re generated, and without the need for IT teams to gather, analyze and generate reports, which is time-consuming and uses resources that could be better applied elsewhere.
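
    As a rough sketch of how edge-level aggregation can reduce what gets shipped upstream, a device might summarize each window of raw sensor readings locally and forward only a compact summary. The readings and alert threshold below are invented for illustration, not taken from any specific platform:

```python
from statistics import mean

def summarize_readings(readings, alert_threshold):
    """Aggregate one window of raw sensor readings at the edge.

    Rather than shipping every reading to a central warehouse, the
    device forwards a single compact summary per window, flagging
    windows whose peak exceeds a (hypothetical) alert threshold.
    """
    peak = max(readings)
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": peak,
        "alert": peak > alert_threshold,
    }

# One window of temperature readings from a single machine sensor
window = [71.2, 70.8, 71.5, 74.9, 71.1]
result = summarize_readings(window, alert_threshold=74.0)
```

    Only the summary dictionary needs to leave the device, which is how edge analytics avoids the gather-analyze-report cycle described above.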

    Here are three key areas where data adds value to the manufacturing process to give companies a competitive edge.

    How data enhances product development

    Every part of a business generates big data. Analyzing data from disparate sources to identify relationships between processes, causes, and effects is part of what helps a business hone its product development strategy, manufacturing processes, the marketing and sales of those products, and the logistics of supply chain and delivery.

    We asked Christine Quan, Sisense BI Engineer in sales, how she thinks data helps product development, and she said: 

    'Surveying market data enables you to have a better understanding of customer needs and can also be a way to gather feedback for initial product ideas'.

    Indeed, data enables a company to understand its customers better. With this information, it can develop new products or improve existing products to meet customers’ needs. At the same time, data can inform a company about potential markets so it can judge how much risk an innovation carries. Consequently, this risk can be mitigated in the product development process, because the more a manufacturer knows before production, the less of a gamble it is. Furthermore, actionable insights derived from data both before and during production can be used to plan and hone the manufacturing process, and enhance many operational aspects.

    Take BraunAbility, for example. The company manufactures and installs adaptations to personal and commercial vehicles that make them wheelchair accessible. Using a BI and analytics platform, BraunAbility has improved its understanding of customer preferences in different markets. Data has given the company the insights to drive production of the most in-demand products, make informed decisions about what it keeps in stock and even what product discounts should be offered to impact the sales rate positively. With this new information, BraunAbility sees better profit margins across the board.

    Data improves and streamlines production quality control

    The analysis of big data sets generated in the manufacturing process can minimize production defects and keep quality standards high, while at the same time increasing efficiency, wasting less time, and saving more money.

    Embedded analytics are particularly valuable in terms of quality control and optimizing manufacturing efficiency. Computerized and automated monitoring systems, far more sensitive and accurate than the human eye, capture discrepancies more accurately and more cheaply, around the clock. This continuous, smart, machine-based scrutiny significantly decreases the number of tests essential to maintain quality parameters. Data can also be used to calculate the probabilities of delays, to identify, develop and implement backup plans.
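
    One simple, widely used form of this machine-based scrutiny is a statistical control check: flag any measurement that drifts more than three standard deviations from the mean of a known-good baseline run. A minimal sketch, with invented measurements:

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Compute 3-sigma control limits from baseline measurements."""
    m, s = mean(baseline), stdev(baseline)
    return m - 3 * s, m + 3 * s

def out_of_control(measurements, baseline):
    """Return the measurements falling outside the control limits."""
    lo, hi = control_limits(baseline)
    return [x for x in measurements if x < lo or x > hi]

# Illustrative: part diameters (mm) from a stable baseline run
baseline = [10.01, 9.99, 10.02, 9.98, 10.00, 10.01, 9.99, 10.00]
flagged = out_of_control([10.00, 10.01, 10.12, 9.99], baseline)
```

    A sensor running this check continuously can flag every discrepancy as it happens, instead of relying on periodic sample tests.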

    Embedded analytics are also faster and more autonomous than more traditional data analysis. With embedded analytics, it’s no longer necessary for data analysts to feed the data lake to the stand-alone cloud data warehouse, then mash up the data and verify the results. Analytic technology embedded within machinery can do the job at the point at which the data is generated. So, less intervention from data analysts is necessary, decisions can be influenced directly by data and processes are accelerated, using fewer resources.

    Effectively, Big Data enables manufacturers to improve and streamline their processes across production and quality control. As Christine Quan explains: 

    'Setting up a comprehensive data feedback loop enables you to get real-time information about all aspects of your manufacturing processes, so you can rapidly calibrate them to boost production efficiency'.

    This is particularly pertinent to asset-heavy industries such as pharmaceuticals, electronics, and aerospace parts manufacturing, in which superior asset management is critical for efficient and profitable operation.

    Certain processing environments like pharmaceuticals, chemicals, and mining are prone to considerable swings in variability. Coupled with the number and complexity of elements in the production processes in these industries, such companies can find it challenging to maintain the stability and uniformity of processes. They can benefit most from advanced analytics because it provides a highly granular approach to diagnosing and correcting process flaws.

    McKinsey & Company gives the example of the biopharmaceutical industry, which includes the manufacture of vaccines, hormones, and blood components. They are made using live, genetically engineered cells, and production often involves monitoring hundreds of variables to ensure the purity of the ingredients and the substances being made. Two batches of a particular substance, produced using the same process, can still vary considerably in yield without explanation. This can detrimentally affect capacity and product quality and can attract intensified attention from regulators.

    Advanced data analytics can overcome this issue without incurring huge costs. By segmenting the manufacturing process into clusters of related production activities, gathering data about these, and analyzing the data to show interdependencies, it’s possible to identify stages in the process that influence the variability in yield. Address those and the yield can increase by a value of millions of dollars per product.
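
    The segment-and-correlate approach described above can be sketched in a few lines: group process variables into clusters, then rank the clusters by how strongly they track batch yield. All variable names and figures below are invented for illustration:

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def rank_clusters(yields, clusters):
    """Rank process-variable clusters by |correlation| with batch yield."""
    scored = {name: abs(pearson(values, yields))
              for name, values in clusters.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Invented batch data: yield per batch and two clusters of process variables
yields = [92.0, 88.5, 95.0, 90.0, 85.5]
clusters = {
    "fermentation_temp": [36.9, 36.2, 37.4, 36.6, 35.9],  # tracks yield closely
    "buffer_ph":         [7.1, 7.0, 7.1, 7.2, 7.0],       # mostly noise
}
ranking = rank_clusters(yields, clusters)
```

    The top-ranked cluster is where process engineers would look first for the stages driving yield variability; real analyses add controls for confounding, but the ranking idea is the same.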

    Improving the supply chain and mitigating its risk

    Major manufacturing processes require a lot of raw materials and components that together form a complex supply chain. Inevitably, the larger and more complex the supply chain, the riskier and more prone to problems it is. Many supply chains struggle to gather and make sense of the huge volume of data that they generate. However, Christine points out that: 

    'Having the right data can help you de-risk decisions by providing a more holistic view of your supply chain'.

    That’s because big data analytics and cognitive technologies like machine learning bring visibility to supply chains, and help manufacturers manage them, mitigate the risks, offer a better customer experience, and therefore give them a competitive edge.

    Analyzing data can identify where and how problems are occurring and can even predict where delays and other issues might occur. So, robust analytics allows manufacturers to develop and implement contingency plans that enable them to harmonize the supply chain with manufacturing requirements, sustain the pace of production and maintain maximal efficiency, essential for the ongoing performance of your business.

    Making the changes work for manufacturing

    Of course, manufacturing predates the advent of smart data analytics and in some cases, it takes time for it to catch up with emerging trends. Nevertheless, manufacturers know that to stay ahead they need to adopt new processes and technologies involving data, analytics, AI and machine learning.

    These technologies can drive improvements in modern manufacturing environments that face the challenges of process complexity, variability, capacity, and speed. By applying smart data techniques to the manufacturing process, companies can meet and exceed demand and the requirements of the market, anticipate and avoid possible risks, minimize waste and reduce problems, and maintain high quality standards. Harnessing the power of big data and implementing the right analytics technology will ensure that manufacturers achieve their business goals more efficiently and cost-effectively than ever before. 

    Source: Sisense

    Author: Adam Murray

  • Ethical Intelligence: can businesses take the responsibility?

    Ethical Intelligence: can businesses take the responsibility?

    Adding property rights to inherent human data could provide a significant opportunity and differentiator for companies seeking to get ahead of the data ethics crisis and adopt good business ethics around consumer data.

    The ability for a business to operate based on some amount of intelligence is not new. Even before business owners used manual techniques such as writing customer orders in a book or using calculators to help forecast how many pounds of potatoes might be needed to stock up for next week's sales, there were forms of "insight searching." Enterprises are always looking for operational efficiencies, and today they are gathering exponentially more intelligence.

    A significant part of business intelligence is understanding customers. The more data a company has about its current or prospective customers' wants, likes, dislikes, behaviors, activities, and lifestyle, the more intelligence that business can generate. In principle, more data suggests the possibility of more intelligence.

    The question is: are most businesses and their employees prepared to be highly intelligent? If a company were to reach a state where it has significant intelligence about its customers, could it resist the urge to manipulate them?

    Suppose a social media site uses data about past activities to conclude that a 14-year-old boy is attracted to other teenage boys. Before he discovers where he might be on the gay/straight spectrum, could the social media executives, employees, and/or algorithms resist the urge to target him with content tagged for members of the LGBTQ community? If they knowingly or unknowingly target him with LGBTQ-relevant content before the child discovers who he might be, is that behavior considered ethical?

    Looking for best practices

    Are businesses prepared to be responsible with significant intelligence, and are there best practices that would give a really intelligent business an ethical compass?

    The answer is maybe, leaning toward no.

    Business ethics is not something new either. Much like business intelligence, it evolved over time. What is new, though, is that ethics no longer has to be embedded only in the humans who make business decisions. It must also be embedded in the automated systems that make business decisions. The former, although imperfect, is conceivable. You might be able to hire ethical people or build a culture of ethics in people. The latter is more difficult. Building ethics into systems is neither art nor science. It is a confluence of raw materials, many of which we humans still don't fully understand.

    Business ethics has two components. One is the aforementioned ethics in systems (sometimes called AI ethics) that is primarily focused on the design of algorithms. The other component of business ethics is data ethics, which can be measured from two dimensions: the algorithm and the raw material that goes into the algorithm (that is, the data).

    AI ethics is complex, but it is being studied. At the core of the complexity are human programmers who are usually biased and can have varying ethical frameworks and customs. They may create potentially biased or unethical algorithms.

    Data ethics is not as complex but is not widely studied. It covers areas such as consent for the possession of data, authorization for the use of data, the terms under which an enterprise is permitted to possess and use data, whether the value created from data should be shared with the data's source (such as a human), and how permission is secured to share insights derived from data.

    Another area of data ethics is whether the entire data set is representative of society. For example, is an algorithm determining how to spot good resumes being trained with 80 percent resumes from men and just 20 percent from women?
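
    A representativeness check like the resume example above is easy to automate before training. In this sketch, the field name and the 40% floor are illustrative choices, not established standards; any group whose share of the training set falls below the floor is flagged:

```python
from collections import Counter

def representation_report(records, field, floor=0.4):
    """Report each group's share of the dataset and flag groups whose
    share falls below `floor` (an illustrative policy choice)."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    shares = {group: n / total for group, n in counts.items()}
    flagged = [g for g, share in shares.items() if share < floor]
    return shares, flagged

# Invented training set mirroring the 80/20 resume example above
resumes = [{"gender": "m"}] * 80 + [{"gender": "f"}] * 20
shares, flagged = representation_report(resumes, "gender")
```

    Such a check does not make a dataset ethical by itself, but it surfaces the imbalance before it is baked into an algorithm.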

    These are large social, economic, and historical constructs to sort out. As companies become exponentially more intelligent, the need for business ethics will increase likewise. As a starting point, corporations and executives should consider consent for and authorization of data used in business intelligence. Was the data collected with proper consent? Meaning: does the user really know that their data is being monetized or was it hidden in a long terms and conditions agreement? What were the terms and conditions? Was the data donated, was it leased, or was it "sort of lifted" from the user?

    Many questions, limited answers.

    The property rights model

    Silicon Valley is currently burning in a data ethics crisis. At the core is a growing social divide about data ownership between consumers, communities, corporations, and countries. We tend to anticipate that new problems need new solutions. In reality, sometimes the best solution is to take something we already know and understand and retrofit it into something new.

    One emerging construct uses a familiar legal and commercial framework to enable consumers and corporations to find agreement around the many unanswered questions of data ownership. This construct uses the legal and commercial framework of property as a set of agreements to bridge the growing divide between consumers and corporations on the issues of data ownership, use, and consideration for value derived from data.

    If consumer data is treated as personal property, consumers and enterprises can reach agreement using well-understood and accepted practices such as a title of ownership for one's data, track and trace of data as property, leasing of the data as property, protection from theft, taxation of income created from said data, tax write-offs for donating the data, and the ability to include data property as part of one's estate.

    For corporations and executives, with increasing business intelligence comes increasing business ethics responsibilities.

    What is your strategy?

    Author: Richie Etwaru

    Source: TDWI

  • ETL or ELT? Five arguments in favor of ETL

    ETL or ELT? Five arguments in favor of ETL

    The ETL and data integration market is highly mature, but it still sees new trends emerge, especially when underlying platforms, technologies, and tools see innovation. We outlined one new trend – ETL++ – in this article on Medium.

    Another trend sweeping the data integration community is ELT – extract, load, and transform.  It is a new pattern of data integration whereby data is first extracted and loaded into a cloud data warehouse (CDW), then it is transformed inside the CDW.  This trend has emerged specifically in the cloud, taking advantage of powerful new CDW platforms.

    At the same time as the ELT model emerged, ETL evolved into ETL++, in the process becoming “code-free” and “schema-free” and gaining re-use and extensibility. This eliminated mundane and error-prone coding tasks, making ETL data pipelines far more reliable, and made data pipeline creation, testing, and deployment far easier and more efficient.

    There are advantages and disadvantages of each approach. Here we will explore five arguments in favor of using an ETL approach.

    1. Less Data Replication

    With ELT, you are replicating all the raw data you need for analysis into your CDW, then transforming it into an analytics-ready form. For many organizations, this will triple or even quadruple the amount of data that is being replicated for analytics.

    With ETL, you transform the data in-flight, requiring no extra data movement and storage.  All you load into your CDW is analytics-ready data. As you will see in upcoming sections, having less data replication can eliminate potential security risks, reduce your governance burden, and lower the overall cost of your analytics. In general, the less you replicate data, the more efficient your data and analytics stack is.
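
    The difference in replicated volume can be made concrete with a toy pipeline: ELT loads every raw record and keeps a transformed copy alongside it, while ETL transforms in flight and loads only the analytics-ready rows. The record layout and filter below are invented for illustration:

```python
def transform(record):
    """An illustrative in-flight transformation: keep only completed
    orders and project just the fields analysts need."""
    if record["status"] != "completed":
        return None
    return {"order_id": record["id"], "amount": record["amount"]}

def etl_load(raw_records):
    """ETL: transform in flight; only analytics-ready rows reach the CDW."""
    return [t for t in (transform(r) for r in raw_records) if t is not None]

raw = [
    {"id": 1, "status": "completed", "amount": 120.0, "debug_blob": "x"},
    {"id": 2, "status": "cancelled", "amount": 0.0, "debug_blob": "x"},
    {"id": 3, "status": "completed", "amount": 75.5, "debug_blob": "x"},
]

elt_rows_stored = len(raw) + len(etl_load(raw))  # ELT keeps raw AND transformed copies
etl_rows_stored = len(etl_load(raw))             # ETL stores only transformed rows
```

    Even in this tiny example, the ELT path stores more than twice as many rows; at warehouse scale, that gap is what drives the storage and governance overhead discussed here.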

    2. Easier, More Effective Data Security and Governance

    The more data you replicate for analytics, the greater the burden and risk of securing and governing this data. All the raw data that is moved into the CDW will need to be secured and governed, adding extra work to already taxed data security and governance teams.

    With ETL, you are only moving analytics-ready data into the CDW, reducing the potential for security holes and lowering governance risk and effort. Using an ETL platform with highly evolved security and governance features ensures consistency and high levels of data security and governance across the entire data lifecycle – extract, transformation, and loading.

    3. Broader Data Transformation Capabilities

    SQL has evolved over the years to support much of the standard data transformation capabilities. What SQL doesn’t have can be supplemented with user-defined functions (UDFs).  But SQL is only designed to work on highly structured data (tables) and can require complex coding for more sophisticated transformation.

    The code-free approach of modern ETL tools makes it fast and easy to graphically apply a wide range of functions that can also work on more modern, complex data formats. Spectrum offers over 300 functions, some extremely sophisticated, that can be applied in a single click, such as extracting meaning or sentiment from text fields, completely re-organizing datasets, slicing and dicing data into intricate buckets, or encoding data for AI and machine learning models.

    4. More Reliable Data Pipelines

    With modern, code-free ETL tools, all data pipelines can be defined and deployed without writing one line of code. Not only does this make the definition and deployment process faster and more efficient, but it also eliminates potential coding errors, making your data pipelines more reliable.

    The schema-free approach, graphical tools, and full data lineage of modern ETL tools such as Spectrum also reduce potential requirements mismatches during projects. Spectrum also provides end-to-end data pipeline monitoring and auditability, including the T process, for data observability. 

    5. Transparent Costs

    As we’ve discussed, in an ELT model, raw data is stored in a normalized schema in the CDW, then transformed into an analytics-ready form, such as a denormalized materialized table or a multi-dimensional aggregated table. This will increase:

    • CDW storage costs as both raw and analytics-ready data are stored
    • CDW compute costs as transformation queries are pushed down into the CDW

    Transforming from a normalized schema to a denormalized analytics-ready one requires many-way joins, unions, and aggregations, which are extremely “expensive” compute operations in a CDW. These extra hidden costs will increase your monthly CDW bill, and there are additional costs of teams trying to write, debug, and deploy SQL transformations.


    There are advantages and disadvantages of both ETL and ELT models for data integration. For instance, some organizations are very efficient at using SQL for transforming data and know how to effectively manage their CDW usage.

    So, why not hedge your bets and use a data integration platform that supports both ETL and ELT models? This way, you can choose the best approach for each data pipeline based on its unique needs.

    Author: John Morrell

    Source: Datameer

  • Facebook to face lawsuit regarding 'worst security breach ever'

    Facebook to face lawsuit regarding 'worst security breach ever'

    Facebook Inc. failed to fend off a lawsuit over a data breach that affected nearly 30 million users, one of several privacy snafus that have put the company under siege.

    The company’s disclosure in September that hackers exploited several software bugs to obtain login access to accounts was tagged as Facebook’s worst security breach ever. An initial estimate that as many as 50 million accounts were affected was scaled back weeks later.

    A federal court in San Francisco rejected the company’s request to block the lawsuit on June 21, saying claims against Facebook can proceed for negligence and for failing to secure users’ data as promised. Discovery should move 'with alacrity' for a trial, U.S. District Judge William Alsup said in his ruling. He dismissed breach-of-contract and breach-of-confidence claims due to liability limitations. Plaintiffs can seek to amend their cases by July 18.

    'From a policy standpoint, to hold that Facebook has no duty of care here ‘would create perverse incentives for businesses who profit off the use of consumers’ personal data to turn a blind eye and ignore known security risks', Judge Alsup said, citing a decision in a separate case.

    The world’s largest social network portrayed itself as the victim of a sophisticated cyber-attack and argued that it isn’t liable for thieves gaining access to user names and contact information. The company said attackers failed to get more sensitive information, like credit card numbers or passwords, saving users from any real harm.

    Attorneys for users called that argument 'cynical', saying in a court filing that Facebook has 'abdicated all accountability' while 'seeking to avoid all liability' for the data breach despite Chief Executive Officer Mark Zuckerberg’s promise that the company would learn from its lapses. The case was filed in San Francisco federal court as a class action.

    Facebook didn’t immediately respond to a request for comment.

    The Menlo Park, California-based company faces a slew of lawsuits and regulatory probes of its privacy practices after revelations in early 2018 that it allowed the personal data of tens of millions of users to be shared with political consultancy Cambridge Analytica. As lawmakers have focused greater scrutiny on the company, Zuckerberg called for new global regulation governing the internet in March, including rules for privacy safeguards.

    The case is Echavarria v. Facebook Inc., 3:18-cv-05982, U.S. District Court, Northern District of California (San Francisco).

    Author: Kartikay Mehrotra and Aoife White

    Source: Bloomberg

  • Five challenges to overcome in order to improve sales productivity

    Five challenges to overcome in order to improve sales productivity

    Many of the sales leaders we work with say they need to find ways to make their sales team more productive. It’s no wonder, given the constant pressure to deliver results. Sales leaders also recognize that meeting sales goals consistently requires a broad-based contribution from everyone on the team — not just a few star players.

    How do you foster that contribution and help sales teams be truly productive? A starting point is to understand some of the key productivity challenges that today’s sales teams face:

    The complexity of today’s buyer 

    Today’s B2B buyers are more self-directed than ever before, and they’re likely to base much of their decision-making on information they find online. When buyers do interact with sales reps, they expect a continuous experience — which means that to prospect productively, reps need to know where buyers are in the journey and provide the information they need to progress.

    Lack of comfort with virtual selling

    54% of sales reps in Forrester’s latest sales activity study said that losing the ability to meet with clients face-to-face has hurt their ability to meet quota. Though sales teams continue to hone their virtual selling skills, achieving the same level of proficiency as in an in-person environment takes practice. Virtual selling will be the norm in many selling scenarios even after the pandemic, so reps need to build these capabilities to be productive and effective.

    Not using technology to its full potential

    The proliferation of sales technology in recent years can leave sales leaders feeling unsure of where to begin. It’s imperative to work with sales operations to choose the tools that will yield the greatest productivity gains for your organization. Whether it’s automating capture of buyer interactions or leveraging revenue operations platforms that centralize data and analytics, truly understanding what’s available and zeroing in on what will best serve your team can be a game changer.

    Time-draining administrative tasks

    Our latest sales activity study data shows that sales reps spend, on average, more than one-quarter of their working hours on administrative tasks such as internal meetings, order booking, and expense reporting. That’s slightly more than the time they spend in the most productive manner: directly selling to prospects. Finding opportunities to minimize unproductive work is a key to improving team performance.

    Not having the right content

    Our sales activity studies have consistently shown that finding content and information is a significant productivity obstacle for sales teams. Without easy access, reps will miss opportunities to provide information that could help move prospects closer to a sale. Steps such as consolidating content into a centralized repository and categorizing it by buyer journey phase can contribute to greater sales success.

    Working through these sales productivity challenges is essential to enabling reps to perform as effectively as possible!

    Author: Phil Harrell

    Source: Forrester

  • Four Drivers of Successful Business Intelligence

    Companies across industries face some very common scenarios when it comes to getting the most value out of data. The life science industry is no exception. Sometimes a company sets out to improve business intelligence (BI) for a brand, division or functional area. It spends many months or years and millions of dollars to aggregate all of the data it thinks it needs to better measure performance and make smart business decisions only to yield more data. In another familiar scenario, a team identifies critical questions the BI system can't answer. Again, months and millions go into development. But by the time the system goes live, market and/or company conditions have changed so much that the questions are no longer relevant.

    Building Better Business Intelligence Systems
    Today's challenges cannot be met by throwing more dollars into the marketing budget or by building more, or bigger, data warehouses. Ultimately, navigating today's complexities and generating greater value from data isn't about more, it's about better. The good news is that other industries have demonstrated the power and practicality of analytics at scale. Technology has evolved to overcome fragmented data and systems. We are now observing a real push in life sciences for a BI capability that's smarter and simpler.

    So how do we build better business intelligence platforms? In working with life sciences companies around the globe, IMS Health has observed a recurring journey with three horizons of business intelligence maturity: alignment of existing KPIs, generation of superior insights and customer-centric execution (see Figure 1).

    What does it take to advance in business intelligence maturity?
    No matter where a company currently stands, there are four fundamental steps that drive BI success: aligning business and information management strategy, improving information management systems integration and workflow, engineering BI systems to derive more value and insights from data, and making the most of new cloud computing technologies and Software-as-a-Service (SaaS) models for delivery.

    Step 1: Align Business and Information Management Strategy
    Many IT and business leaders recognize that the traditional "build it and they will come" mentality can no longer sustain future growth in agile and cost-efficient ways. To be successful, companies need to focus upfront on developing an information management strategy that begins with the business in mind. Through a top-down and upfront focus on critical business goals, drivers and pain points, companies can ensure that key insights are captured to drive development of commercial information management strategies that align with prioritized business needs. Leading organizations have achieved success via pilot-and-prove approaches that focus on business value at each step of the journey. To be successful, the approach must be considered in the context of the business and operational strategies.

    Step 2: Improving Information Management Systems Integration and Workflow
    Although technology systems and applications have proliferated within many organizations, they often remain siloed and sub-optimized. Interoperability is now a key priority and a vehicle for optimizing commercial organizations: improving workflow speed, eliminating conflicting views of the truth across departments, and paring down vendor teams managing manual data handoffs. Information and master data management systems must be integrated to deliver an integrated view of the customer. When optimized, these systems can enable advanced BI capabilities, ranging from improved account management and evolved customer interactions (i.e. account-based selling and management, insights on healthcare networks and relationships with influencers and KOLs) to harnessing the power of big data and demonstrating value to all healthcare stakeholders.

    Step 3: Engineering BI Systems to Derive More Value and Insights from Data
    Life sciences companies compete on the quality of their BI systems and their ability to take action in the marketplace. Yet existing analytics systems often fail to deliver value to end users. Confusing visualizations, poorly designed data queries and gaps in underlying data are major contributors in a BI solution's inability to deliver needed insights.

    By effectively redesigning BI applications, organizations can gain new insights and build deeper relationships with customers while maximizing performance. Effective BI tools can also help to optimize interventions and the use of healthcare resources. They can drive post-marketing research by unearthing early signals of value for investigation, help companies better engage and deliver value to their customers, and contribute to improved patient outcomes. This information can advance the understanding of how medicine is practiced in the real world, from disease prevention through diagnosis, treatment and monitoring.

    Step 4: Making the Most of New Cloud Computing Technologies and Software-as-a-Service (SaaS) Models for Delivery
    Chief information officers (CIOs) are increasingly looking to adopt cloud technologies in order to bring the promise of technology to commercialization and business intelligence activities. They see the potential value of storing large, complex data sets, including electronic medical records and other real-world data, in the cloud. What's more, cloud companies have taken greater responsibility for maintaining government-compliant environments for health information.

    New cloud-based BI applications are fueling opportunities for life sciences companies to improve delivery of commercial applications, including performance management, advanced analytics, sales force automation, master data management and the handling of large unstructured data streams. As companies continue their journey toward BI maturity, getting the most from new technologies will remain a high priority. Leveraging cloud-based information management and business intelligence platforms will bring tremendous benefits to companies as approaches are revised amidst changing customer demands and an urgent need for efficiency.

    The Way Forward
    While each organization's journey will be unique, advancing in business intelligence maturity, and getting more value from data, can be achieved by all with these four steps. It's time for BI that's smarter and simpler and that realizes greater value from data. With focus and precision, and the support of business and technology experts, companies can home in on the key indicators and critical questions that measure, predict and enhance performance.

    Source: ExecutiveInsight

  • Gaining advantages with the IoT through 'Thing Management'

    Gaining advantages with the IoT through 'Thing Management'

    Some are calling the industrial Internet of Things the next industrial revolution, bringing dramatic changes and improvements to almost every sector. But to be sure it’s successful, there is one big question: how can organizations manage all the new things that are part of their organizations’ landscapes?

    Most organizations see asset management as the practice of tracking and managing IT devices such as routers, switches, laptops and smartphones. But that’s only part of the equation nowadays. With the advent of the IoT, enterprise things now include robotic bricklayers, agitators, compressors, drug infusion pumps, track loaders, scissor lifts and the list goes on and on, while all these things are becoming smarter and more connected.

    Here are some examples from specific industries:

    ● Transportation is an asset-intensive industry that relies on efficient operations to achieve maximum profitability. To help customers manage these important assets, GE Transportation is equipping its locomotives with devices that manage hundreds of data elements per second. The devices decipher locomotive data and uncover use patterns that keep trains on track and running smoothly.

    ● The IoT’s promise for manufacturing is substantial. The IoT can build bridges that help solve the frustrating disconnects among suppliers, employees, customers, and others. In doing so, the IoT can create a cohesive environment where every participant is invested in and contributing to product quality, and every customer’s feedback is learned from. Smart sensors, for instance, can ensure that every item, from articles of clothing to top-secret defense weapons, has the same quality as the one before. The catch is that the many pieces of the manufacturing puzzle and the devices in the IoT are moving so quickly that spreadsheets and human analysis alone are not enough to manage them.

    ● IoT in healthcare will help connect a multitude of people, things with smart sensors (such as wearables and medical devices), and environments. Sensors in IoT devices and connected “smart” assets can capture patient vitals and other data in real time. Then data analytics technologies, including machine learning and artificial intelligence (AI), can be used to realize the promise of value-based care. There’s significant value to be gained, including operational efficiencies that boost the quality of care while reducing costs, clinical improvements that enable more accurate diagnoses, and more.

    ● In the oil and gas industry, IoT sensors have transformed efficiencies around the complex process of natural resource extraction by monitoring the health and efficiency of hard-to-access equipment installations in remote areas with limited connectivity.

    ● Fuelled by greater access to cheap hardware, the IoT is being used with notable success in logistics and fleet management by enabling cost-effective GPS tracking and automated loading/unloading.

    All of these industries will benefit from the IoT. However, as the IoT world expands, these industries and others are looking for ways to track the barrage of new things that are now pivotal to their success. Thing Management pioneers such as Oomnitza help organizations manage devices as diverse as phones, forklifts, drug infusion pumps, drones and VR headsets, providing an essential service as the industrial IoT flourishes.

    Think IoT, not IoP

    To successfully manage these things, enterprises are looking not only for Thing Management. They are also rethinking the Internet: not as the Internet of People (IoP), but as the Internet of Things (IoT). Things aren’t people, and there are three fundamental differences.

    Many more things are connected to the Internet than people

    John Chambers, former CEO of Cisco, recently declared there will be 500 billion things connected by 2024. That’s more than 60 times the number of people on the planet.

    Things have more to say than people

    A typical cell phone has nearly 14 sensors, including an accelerometer, GPS, and even a radiation detector. Industrial things such as wind turbines, gene sequencers, and high-speed inserters can easily have over 100 sensors.

    Things can speak much more frequently

    People enter data at a snail’s pace when compared to the barrage of data coming from the IoT. A utility grid power sensor, for instance, can send data 60 times per second, a construction forklift once per minute, and a high-speed inserter once every two seconds.
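
    These figures can be made concrete with a little back-of-envelope arithmetic. The sketch below (device names are illustrative; the rates are the ones quoted above) converts each rate into messages per day:

```python
# Back-of-envelope: daily message counts implied by the sensor rates above.
SECONDS_PER_DAY = 24 * 60 * 60

# Seconds between messages for each (hypothetical) device type.
period_seconds = {
    "grid_power_sensor": 1 / 60,     # 60 messages per second
    "high_speed_inserter": 2,        # one message every two seconds
    "construction_forklift": 60,     # one message per minute
}

messages_per_day = {
    name: SECONDS_PER_DAY / period for name, period in period_seconds.items()
}
# A single grid power sensor alone produces over five million readings a day.
```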

    Technologists and business people alike need to learn how to collect the data coming from the industrial IoT, put it to use, and manage every connected thing. They will have to learn how to build enterprise software for things rather than for people.

    How the industrial IoT will shape the future

    The industrial IoT is all about value creation: increased profitability, revenue, efficiency, and reliability. It starts with the target of safe, stable operations and meeting environmental regulations, translating to greater financial results and profitability.

    But there’s more to the big picture of the IoT than that. Building the next generation of software for things is a worthy goal, with potential results such as continually improving enterprise efficiency and public safety, driving down costs, decreasing environmental impacts, boosting educational outcomes and more. Companies like GE, Oomnitza and Bosch are investing significant amounts of money in the ability to connect, collect data, and learn from their machines.

    The IoT and the next generation of enterprise software will have big economic impacts as well. The cost savings and productivity gains generated through “smart” thing monitoring and adaptation are projected to create $1.1 trillion to $2.5 trillion in value in the health care sector, $2.3 trillion to $11.6 trillion in global manufacturing, and $500 billion to $757 billion in municipal energy and service provision over the next decade. The total global impact of IoT technologies could generate anywhere from $2.7 trillion to $14.4 trillion in value by 2025.

    Author: Timothy Chou

    Source: Information-management

  • Getting the most out of your data with analytics: three best practices

    Getting the most out of your data with analytics: three best practices

    Data has three main functions that provide value to the business: To help in business operations, to help the company stay in compliance and mitigate risk, and to make informed decisions using analytics.

    “Data can have an impact on your top line as well as your bottom line,” said Dr. Prashanth Southekal, CEO of DBP-Institute in a recent interview with DATAVERSITY®.

     “Just capturing, storing, and processing data will not transform your data into a business asset. Appropriate strategy and the positioning of the data is also required,” he said. Southekal shared best practices for analytics and ways to transform data into an asset for the business.

    Lack of Analytics Success

    Gartner predicts that by 2022, 90 percent of corporate strategies will explicitly mention information as a critical enterprise asset and analytics as an essential competency. “Given that organizations across the world are looking for ways to glean insights from analytics and make good decisions, not many companies today are very successful in analytics,” he said.

    According to a recent McKinsey survey, most companies understand the importance of analytics and have adopted common best practices, Southekal remarked. Yet fewer than 20 percent have maximized the potential and achieved advanced analytics at scale. With this in mind, Southekal compiled a list of analytics best practices, using his experience working with successful analytics projects, projects with challenges, and those that fail.

    Data Must be Usable

    Data from text, video, audio, and other similar sources is in an unstructured form when it is initially captured, he said. The process of converting it from a raw state into a processed format creates value because the data becomes usable for insights and decision-making.
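
    As a rough sketch of that conversion step, the snippet below turns hypothetical raw text records into structured rows; the delimiter and field names are invented for illustration:

```python
# Hypothetical support-ticket text captured as free-form strings.
raw_tickets = [
    "2024-01-05 | priority: high | battery drains fast",
    "2024-01-06 | priority: low  | minor UI glitch",
]

def structure(record: str) -> dict:
    """Convert one raw text record into a structured row usable for analytics."""
    date, priority, text = [part.strip() for part in record.split("|")]
    return {
        "date": date,
        "priority": priority.replace("priority:", "").strip(),
        "summary": text,
    }

rows = [structure(r) for r in raw_tickets]
```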

    Intuition vs. Data

    “The real substitute for data is intuition,” he said. Insight for decision-making can come from data or from intuition, so in companies where data literacy is poor, intuition will prevail over data in decision-making. Users no longer need to rely on intuition once they realize they can make better decisions with good data.

    Top Three Best Practices for Analytics

    • Improve Data Quality: Southekal defines analytics as the process of gaining insight by using data to answer business questions. Unfortunately, Data Quality is very poor in most business enterprises, he said, and poor-quality data cannot provide reliable insights. Data Quality will continue to remain poor under the current business paradigm, where businesses are constantly evolving — both internally and externally — in response to changing market conditions. Mergers and acquisitions require internal and external changes to often disparate data sources and systems. “Data Quality is a moving target and you can’t assume that if your data is good today, it will continue to be good, even after two years.” One option is to wait for the quality to improve over time, but in order to move forward in the immediate future, Southekal suggests creating a work-around with data sampling, acquisition, and blending of data from external sources, as well as investments in feature engineering.
    • Improve Data Literacy: More companies are recognizing that Data Literacy is critical to their future success with digital technologies and data analytics. Poor data literacy ranks as the second largest barrier to success among Gartner’s survey of Chief Data Officers, he said, who feel increased responsibility to ensure that data is easily available to stakeholders to use for all their daily operations. Building a data culture and investing in data literacy can show great benefits.
    • Monetize Data: “Go beyond insights and make the picture a little bit bigger by talking about data monetization,” he said. One effective way to monetize data is to look at data products. Monetization also entails reducing expenses, mitigating risk, and creating new revenue streams with data products.
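
    As a minimal sketch of the kind of Data Quality check described above, the snippet below profiles invented order records for missing values and duplicate keys, two of the many quality dimensions a real pipeline would track:

```python
# Invented order records; None marks a missing value.
orders = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": None},
    {"id": 1, "amount": 120.0},   # duplicate id
    {"id": 3, "amount": 95.5},
]

def quality_report(records, key="amount"):
    """Profile a record set for missing values, duplicate ids, and completeness."""
    ids = [r["id"] for r in records]
    missing = sum(1 for r in records if r[key] is None)
    duplicates = len(ids) - len(set(ids))
    return {
        "missing": missing,
        "duplicates": duplicates,
        "completeness": 1 - missing / len(records),
    }

report = quality_report(orders)
```

    Because quality is a moving target, a report like this would be re-run on every data refresh rather than once.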

    Data Products

    In most places, he said, analytics initiatives are run like projects, with a fixed start and end date and a specific purpose. The focus and the resource commitment inherent in project-based thinking is good, but Southekal recommends also thinking about analytics as a potential data product.

    “LinkedIn is a data product. Bloomberg Solutions is also a data product. You can even build a report which gives you a sales margin and call it a data product.” The objective of building data products is to have scalable and long-term solutions, instead of a short-term solution that ends with the project. “Analytics as a strategic endeavor has to be a long-term initiative, so you have to treat analytics as a data product delivery mechanism, not just as a project initiative.”

    How to Build a Data Product

    He suggests considering the business as a network of customers, employees, vendors, and partners; look at business as an end-to-end value chain. Rather than seeing procurement, for example, as a single line of business, take into account the entire value chain within procurement. This process helps identify all of the players and what their value propositions are, “And will also identify where the value leaks are in the whole chain,” he said. The solutions created to fix those leaks are potential revenue-generating products.

    Analytics Best Practices

    Southekal published a book, Analytics Best Practices. He said that the book offers prescriptive and practical guidance that can be used in a variety of settings. His goal was to address the four pillars of analytics — Data Management, Data Engineering, Data Science, and Data Visualization — with ten best practices, and to do so by focusing on concepts rather than on specific tools or platforms. “It’s practical, it’s complete, it’s neutral.”


    DBP Institute helps companies get the most out of their digital technologies and data by implementing new solutions or by optimizing existing solutions, he said. They work primarily in higher education and corporate settings, as well as offering analytics education and training online and at conferences.

    Recipes for Success

    Because of the effects of the COVID-19 pandemic, many companies are turning to data and digital technologies as key enablers. Southekal created a reference architecture document he calls The Reference Architecture for Digital Enablement (TRADE) to help companies with their digital enablement and analytics initiatives. Analytics is ultimately about data, he said, but to capture data, you need mechanisms for data storage, processing, and integration. “I’ve collected ‘recipes’ for best practices, and now when I work with customers, I bring TRADE, my implementation cookbook.”

    Author: Amber Lee Dennis

    Source: Dataversity

  • Getting Your Machine Learning Model To Production: Why Does It Take So Long?

    Getting Your Machine Learning Model To Production: Why Does It Take So Long?

    A Gentle Guide to the complexities of model deployment, and integrating with the enterprise application and data pipeline. What the Data Scientist, Data Engineer, ML Engineer, and ML Ops do, in Plain English.

    Let’s say we’ve identified a high-impact business problem at our company, built an ML (machine learning) model to tackle it, trained it, and are happy with the prediction results. This was a hard problem to crack that required much research and experimentation. So we’re excited about finally being able to use the model to solve our user’s problem!

    However, what we’ll soon discover is that building the model itself is only the tip of the iceberg. The bulk of the hard work to actually put this model into production is still ahead of us. I’ve found that this second stage can take up to 90% of the time and effort for the project.

    So what does this stage consist of? And why does it take so much time? That is the focus of this article.

    Over several articles, my goal is to explore various facets of an organization’s ML journey as it goes all the way from deploying its first ML model to setting up an agile development and deployment process for rapid experimentation and delivery of ML projects. In order to understand what needs to be done in the second stage, let’s first see what gets delivered at the end of the first stage.

    What does the Model Building and Training phase deliver?

    Models are typically built and trained by the Data Science team. When it is ready, we have model code in Jupyter notebooks along with trained weights.

    • It is often trained using a static snapshot of the dataset, perhaps in a CSV or Excel file.
    • The snapshot was probably a subset of the full dataset.
    • Training is run on a developer’s local laptop, or perhaps on a VM in the cloud

    In other words, the development of the model is fairly standalone and isolated from the company’s application and data pipelines.

    What does “Production” mean?

    When a model is put into production, it operates in two modes:

    • Real-time Inference — perform online predictions on new input data, on a single sample at a time
    • Retraining — for offline retraining of the model nightly or weekly, with a current refreshed dataset

    The requirements and tasks involved for these two modes are quite different. This means that the model gets put into two production environments:

    • A Serving environment for performing Inference and serving predictions
    • A Training environment for retraining

    Real-time Inference and Retraining in Production (Source: Author)

    Real-time Inference is what most people would have in mind when they think of “production”. But there are also many use cases that do Batch Inference instead of Real-time.

    • Batch Inference — perform offline predictions nightly or weekly, on a full dataset

    Batch Inference and Retraining in Production (Source: Author)

    For each of these modes separately, the model now needs to be integrated with the company’s production systems — business application, data pipeline, and deployment infrastructure. Let’s unpack each of these areas to see what they entail.

    We’ll start by focusing on Real-time Inference, and after that, we’ll examine the Batch cases (Retraining and Batch Inference). Some of the complexities that come up are unique to ML, but many are standard software engineering challenges.

    Inference — Application Integration

    A model is usually not an independent entity. It is part of a business application for end users, eg. a recommender model for an e-commerce site. The model needs to be integrated with the interaction flow and business logic of the application.

    The application might get its input from the end-user via a UI and pass it to the model. Alternately, it might get its input from an API endpoint, or from a streaming data system. For instance, a fraud detection algorithm that approves credit card transactions might process transaction input from a Kafka topic.

    Similarly, the output of the model gets consumed by the application. It might be presented back to the user in the UI, or the application might use the model’s predictions to make some decisions as part of its business logic.

    Inter-process communication between the model and the application needs to be built. For example, we might deploy the model as its own service accessed via an API call. Alternately, if the application is also written in the same programming language (eg. Python), it could just make a local function call to the model code.

    This work is usually done by the Application Developer working closely with the Data Scientist. As with any integration between modules in a software development project, this requires collaboration to ensure that assumptions about the formats and semantics of the data flowing back and forth are consistent on both sides. We all know the kinds of issues that can crop up. eg. If the model expects a numeric ‘quantity’ field to be non-negative, will the application do the validation before passing it to the model? Or is the model expected to perform that check? In what format is the application passing dates and does the model expect the same format?
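
    A minimal sketch of such a shared contract check, assuming both sides have agreed that 'quantity' is non-negative and dates are ISO 8601 strings (the field names are hypothetical):

```python
from datetime import date

def validate_model_input(payload: dict) -> dict:
    """Enforce the agreed application/model interface contract."""
    qty = payload.get("quantity")
    if qty is None or qty < 0:
        raise ValueError("quantity must be a non-negative number")
    # Both sides agree on ISO 8601 date strings; this raises ValueError otherwise.
    date.fromisoformat(payload["order_date"])
    return payload

ok = validate_model_input({"quantity": 3, "order_date": "2024-01-05"})
```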


    Real-time Inference Lifecycle (Source: Author)

    Inference — Data Integration

    The model can no longer rely on a static dataset that contains all the features it needs to make its predictions. It needs to fetch ‘live’ data from the organization’s data stores.

    These features might reside in transactional data sources (eg. a SQL or NoSQL database), or they might be in semi-structured or unstructured datasets like log files or text documents. Perhaps some features are fetched by calling an API, either an internal microservice or application (eg. SAP) or an external third-party endpoint.

    If any of this data isn’t in the right place or in the right format, some ETL (Extract, Transform, Load) jobs may have to be built to pre-fetch the data to the store that the application will use.

    Dealing with all the data integration issues can be a major undertaking. For instance:

    • Access requirements — how do you connect to each data source, and what are its security and access control policies?
    • Handle errors — what if the request times out, or the system is down?
    • Match latencies — how long does a query to the data source take, versus how quickly do we need to respond to the user?
    • Sensitive data — is there personally identifiable information that has to be masked or anonymized?
    • Decryption — does data need to be decrypted before the model can use it?
    • Internationalization — can the model handle the necessary character encodings and number/date formats?
    • and many more…

    This tooling gets built by a Data Engineer. For this phase as well, they would interact with the Data Scientist to ensure that the assumptions are consistent and the integration goes smoothly. eg. Is the data cleaning and pre-processing done by the model enough, or do any more transformations have to be built?
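
    For the error-handling point in particular, one common pattern is to wrap each data-source call in a retry with exponential backoff. A minimal sketch, with an invented flaky data source standing in for a real database or API:

```python
import time

def fetch_with_retry(fetch, retries=3, backoff=0.01):
    """Retry a data-source call with exponential backoff on timeouts."""
    for attempt in range(retries):
        try:
            return fetch()
        except TimeoutError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * 2 ** attempt)

# Invented data source that fails twice before succeeding.
calls = {"n": 0}
def flaky_source():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("data source timed out")
    return {"customer_tier": "gold"}

result = fetch_with_retry(flaky_source)
```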

    Inference — Deployment

    It is now time to deploy the model to the production environment. All the factors that one considers with any software deployment come up:

    • Model Hosting — on a mobile app? In an on-premise data center or on the cloud? On an embedded device?
    • Model Packaging — what dependent software and ML libraries does it need? These are typically different from your regular application libraries.
    • Co-location — will the model be co-located with the application? Or as an external service?
    • Model Configuration settings — how will they be maintained and updated?
    • System resources required — CPU, RAM, disk, and most importantly GPU, since that may need specialized hardware.
    • Non-functional requirements — volume and throughput of request traffic? What is the expected response time and latency?
    • Auto-Scaling — what kind of infrastructure is required to support it?
    • Containerization — does it need to be packaged into a Docker container? How will container orchestration and resource scheduling be done?
    • Security requirements — credentials to be stored, private keys to be managed in order to access data?
    • Cloud Services — if deploying to the cloud, is integration with any cloud services required eg. (Amazon Web Services) AWS S3? What about AWS access control privileges?
    • Automated deployment tooling — to provision, deploy and configure the infrastructure and install the software.
    • CI/CD — automated unit or integration tests to integrate with the organization’s CI/CD pipeline.

    The ML Engineer is responsible for implementing this phase and deploying the application into production. Finally, you’re able to put the application in front of the customer, which is a significant milestone!

    However, it is not yet time to sit back and relax 😃. Now begins the ML Ops task of monitoring the application to make sure that it continues to perform optimally in production.

    Inference — Monitoring

    The goal of monitoring is to check that your model continues to make correct predictions in production, with live customer data, as it did during development. It is quite possible that your metrics will not be as good.

    In addition, you need to monitor all the standard DevOps application metrics just like you would for any application — latency, response time, throughput as well as system metrics like CPU utilization, RAM, etc. You would run the normal health checks to ensure uptime and stability of the application.

    Equally importantly, monitoring needs to be an ongoing process, because there is every chance that your model’s evaluation metrics will deteriorate with time. Compare your evaluation metrics to past metrics to check that there is no deviation from historical trends.

    This can happen because of data drift.
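
    As a rough illustration, one simple heuristic (not a formal statistical test) flags a feature whose live mean has shifted by several training standard deviations:

```python
import statistics

def drift_score(train_values, live_values):
    """How many training standard deviations the live mean has shifted."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(live_values) - mu) / sigma

# Invented feature values for illustration.
train = [10, 11, 9, 10, 12, 10, 11, 9]
live_stable = [10, 11, 10, 9]
live_drifted = [25, 27, 26, 28]

stable_score = drift_score(train, live_stable)    # well under any alert threshold
drifted_score = drift_score(train, live_drifted)  # many sigmas out: raise an alert
```

    A production system would typically use a proper distribution test or a library built for drift detection, but the principle is the same.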

    Inference — Data Validation

    As time goes on, your data will evolve and change — new data sources may get added, new feature values will get collected, new customers will input data with different values than before. This means that the distribution of your data could change.

    So validating your model with current data needs to be an ongoing activity. It is not enough to look only at evaluation metrics for the global dataset. You should evaluate metrics for different slices and segments of your data as well. It is very likely that as your business evolves and as customer demographics, preferences, and behavior change, your data segments will also change.
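
    Slice-level evaluation can be sketched as grouping predictions by segment before computing the metric; the segments and records below are invented:

```python
from collections import defaultdict

# Hypothetical scored predictions, each tagged with a customer segment.
records = [
    {"segment": "new",       "correct": True},
    {"segment": "new",       "correct": False},
    {"segment": "returning", "correct": True},
    {"segment": "returning", "correct": True},
]

def accuracy_by_slice(records):
    """Accuracy per segment; a global average would hide per-slice regressions."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r["segment"]].append(r["correct"])
    return {seg: sum(v) / len(v) for seg, v in buckets.items()}

slices = accuracy_by_slice(records)
```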

    The data assumptions that were made when the model was first built may no longer hold true. To account for this, your model needs to evolve as well. The data cleaning and pre-processing that the model does might also need to be updated.

    And that brings us to the second production mode — that of Batch Retraining on a regular basis so that the model continues to learn from fresh data. Let’s look at the tasks required to set up Batch Retraining in production, starting with the development model.


    Retraining Lifecycle (Source: Author)

    Retraining — Data Integration

    When we discussed Data Integration for Inference, it involved fetching a single sample of the latest ‘live’ data. On the other hand, during Retraining, we need to fetch a full dataset of historical data. Also, this Retraining happens in batch mode, say every night or every week.

    Historical doesn’t necessarily mean “old and outdated” data — it could include all of the data gathered until yesterday, for instance.

    This dataset would typically reside in an organization’s analytics stores, such as a data warehouse or data lake. If some data isn’t present there, you might need to build additional ETL jobs to transfer that data into the warehouse in the required format.

    Retraining — Application Integration

    Since we’re only retraining the model by itself, the whole application is not involved. So no Application Integration work is needed.

    Retraining — Deployment

    Retraining is likely to happen with a massive amount of data, probably far larger than what was used during development.

    You will need to figure out the hardware infrastructure needed to train the model — what are its GPU and RAM requirements? Since training needs to complete in a reasonable amount of time, it will need to be distributed across many nodes in a cluster, so that training happens in parallel. Each node will need to be provisioned and managed by a Resource Scheduler so that hardware resources can be efficiently allocated to each training process.

    The setup will also need to ensure that these large data volumes can be efficiently transferred to all the nodes on which the training is being executed.

    And before we wrap up, let’s look at our third production use case — the Batch Inference scenario.

    Batch Inference

    Often, the Inference does not have to run ‘live’ in real-time for a single data item at a time. There are many use cases for which it can be run as a batch job, where the output results for a large set of data samples are pre-computed and cached.

    The pre-computed results can then be used in different ways depending on the use case. eg.

    • They could be stored in the data warehouse for reporting or for interactive analysis by business analysts.
    • They could be cached and displayed by the application to the user when they log in next.
    • Or they could be cached and used as input features by another downstream application.

    For instance, a model that predicts the likelihood of customer churn (ie. they stop buying from you) can be run every week or every night. The results could then be used to run a special promotion for all customers who are classified as high risks. Or they could be presented with an offer when they next visit the site.
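
    A toy sketch of that nightly batch job, with an invented stand-in for the trained churn model and an arbitrary risk threshold:

```python
def churn_risk(customer):
    """Toy stand-in for a trained model: fewer recent orders means higher risk."""
    return 1.0 - min(customer["orders_last_90d"], 10) / 10

# Hypothetical customer records pulled from the warehouse.
customers = [
    {"id": "a", "orders_last_90d": 0},   # inactive: high churn risk
    {"id": "b", "orders_last_90d": 8},   # active: low churn risk
]

# Pre-compute and cache the high-risk segment for the promotion campaign.
high_risk = [c["id"] for c in customers if churn_risk(c) > 0.5]
```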

    A Batch Inference model might be deployed as part of a workflow with a network of applications. Each application is executed after its dependencies have completed.

    Many of the same application and data integration issues that come up with Real-time Inference also apply here. On the other hand, Batch Inference does not have the same response-time and latency demands. But, it does have high throughput requirements as it deals with enormous data volumes.


    As we have just seen, there are many challenges and a significant amount of work involved in putting a model into production. Even after the Data Scientists have a trained model ready, many roles in an organization must come together to bring it to your customers and to keep it humming month after month. Only then does the organization truly get the benefit of harnessing machine learning.

    We’ve now seen the complexity of building and training a real-world model, and then putting it into production. In the next article, we’ll take a look at how the leading-edge tech companies have addressed these problems to churn out ML applications rapidly and smoothly.

    And finally, if you liked this article, you might also enjoy my other series on Transformers, Audio Deep Learning, and Geolocation Machine Learning.

    Author: Ketan Doshi

    Source: Towards Data Science

  • Healthcare analytics and the opportunities to improve patient care

    Healthcare analytics and the opportunities to improve patient care

    Healthcare: everyone needs it, it’s a rapidly technologizing industry, and it produces immense amounts of data every day.

    To get a sense of where analytics fit into this vital market, Sisense interviewed Hamza Jap-Tjong, CEO and Co-Founder of GeriMedica Inzicht, a GeriMedica subsidiary. GeriMedica is a multi-disciplinary electronic medical record (EMR) company servicing the elderly care market and as such, their SaaS platform is filled with data of all kinds. Recently, they rolled out analytics that practitioners could use to improve the quality of care (versus the prior main use case in healthcare analytics, which was done by the billing and finance departments). This helps keep practitioners focused on helping patients instead of spending (wasting) hours in a software product. Hamza opened up about the state of healthcare analytics, how it can improve care for patients, and where the industry is going.

    The state of healthcare analytics

    As previously mentioned, the healthcare industry creates tons of data every day from a wide array of sources.

    'I think tons of data might be an understatement', says Hamza, citing a Stanford study. 'They were talking about data on the scale of exabytes (an exabyte equals a billion gigabytes). Where does all that data come from? Fitbits, iPhones, fitness devices on your person… healthcare data is scattered everywhere: not only treatment plans and records created by practitioners, but also stored in machines (X-rays, photographs, etc.)'.

    Data is the new oil, but without the right tools, the insights locked in the data can’t help anyone. At present, few healthcare organizations (let alone frontline practitioners) are taking advantage of the data at their disposal to improve patient care. Moreover, these teams are dealing with amounts of information so vast that they are impossible to make sense of without help (like from a BI or analytics platform). They can’t combine these datasets to gain a complete picture without help, either. Current software offerings, even if they have some analytical capabilities for the data that they capture, often can’t mash it up with other datasets.

    'In my opinion, we could really improve the data gathering', says Hamza. 'As well as the way we use that data to improve patient care. What we know is that when you look at doctors, nurses, physical therapists, everybody close to care processes and patients, is hankering for data and insights and analytics and we see that at the moment there isn’t a tool that is good enough or easy enough for them to use to gain the insights that they are looking for'.

    Additionally, the current generation of medical software has a high barrier to entry/learning curve when it comes to getting useful insights out. All these obstacles prevent caregivers from helping clients as much as they might be able to with analytics that are easier to use.

    Improving patient care (and improving analytics for practitioners)

    Analytics and insight-mining systems have huge potential to improve patient care. Again, healthcare data is too massive for humans to handle unaided. However, there is hope: Hamza mentioned that AI systems were already being used in medical settings to aggregate research and present an array of options to practitioners without them having to dig through numerous sources themselves.

    'Doctors or nurses usually don't work nine-to-five. They work long shifts and their whole mindset is focused on solving mysteries and helping the patients. They don't have time to scour through all kinds of tables and numbers. They want an easy-to-understand dashboard that tells a story from A to Z in one glance and answers their question'.

    This is a huge opportunity for software and analytics companies to help improve patient care and user experience. Integrating easy-to-understand dashboards and analytics tools within medical software lowers the barrier to entry and serves up insights that practitioners can use to make better decisions. The next step is also giving clinicians the right tools to build their own dashboards to answer their own questions.

    The future of healthcare analytics

    Many healthcare providers might not know how much analytics could be improving their work and the care they give their patients. But they certainly know that they’re spending a lot of time gathering information and putting it into systems (and, again, that they have a ton of data). This is slowly changing today and will only accelerate as time goes on. The realization of how much a powerful analytics and BI system could help them with data gathering, insight harvesting, and providing better care will drive more organizations to start using a software’s analytics capabilities as a factor in their future buying decisions.

    Additionally, just serving up insights won’t be enough. As analytics become more mainstreamed, users will want the power to dig into data themselves, perform ad hoc analyses, and design their own dashboards. With the right tools and training, even frontline users like doctors and nurses can be empowered to create their own dashboards to answer the questions that matter most to them.

    'We have doctors who are designers', says Hamza. 'They are designing their own dashboards using our entire dataset, combining millions of rows and records to get the answers that they are looking for'.

    Builders are everywhere. Just as the healthcare space is shifting away from only using analytics in financial departments and putting insights into the hands of frontline practitioners, the right tools democratize the ability to create new dashboards and even interactive analytics widgets and empower anyone within an organization to get the answers and build the tools they need. Like many other industries, healthcare has to go through a technological transformation.

    Creating better experiences

    When it comes to the true purpose of healthcare analytics, Hamza summed it up perfectly:

    'In the end, it’s all about helping end users create a better experience'.

    The staggering volume of data that the healthcare industry creates presents a huge opportunity for analytics to find patterns and insights and improve the lives of patients. As datasets become more massive and the analytical questions become more challenging, healthcare teams will rely more and more on the analytics embedded within their EMR systems and other software. This will lead them to start using the presence (or lack thereof) and quality of those analytics when making decisions. Software companies that understand this will build solutions that answer questions and save lives, the ones that don’t might end up flatlining.

    Author: Jack Cieslak

    Source: Sisense

  • How AI reinforces stereotypes through biased data

    How AI reinforces stereotypes through biased data

    Artificial intelligence (AI) software systems have been under attack for years for systematizing bias against minorities. Banks using AI algorithms for mortgage approvals are systematically rejecting minority applicants. Many recruiting tools using AI are biased against minority applicants. In health care, African Americans are victims of racial bias from AI-based hospital diagnostic tools. If you understand how data science research produces the models that power AI systems, you will understand why they perpetuate bias, and also how to fix it.

    Merriam Webster defines stereotype as “something conforming to a fixed or general pattern.” Machine learning algorithms, which build models that power AI software systems, are simply pattern matching machines. Deep learning models, which are most commonly used in the latest wave of AI-powered systems, discover patterns in data and perpetuate those patterns in future data. This behavior is great if the goal is to replicate the world represented by the training data. However, for a variety of reasons, that isn’t always the best outcome.

    The root cause of bias in data-driven AI systems comes down to how the data to train those systems is collected, and whether or not the decision-making in that data represents the corporate or societal goals of the deployed system based on that data. Data for AI systems can be derived from many sources, including: the real world, data collection efforts, and synthetic data generation.

    The real world contains a lot of data. For instance, the mortgage applications from a period in the past constitute a training set for a mortgage approval system. Job applications, along with the hiring decisions and the outcomes of those hires, provide data to a human resources hiring system. Similarly, medical data from patients over a time period, including their symptoms, histories, diagnoses, and treatments, might be a useful data set for a medical treatment system.

    Absent valid, useful, or available real-world data, machine learning researchers and application developers can collect data “artificially”, deciding what data they want to collect, deploying a team to design a data collection process, and going out into the world and collecting the data proactively. This data set might be more targeted to the needs of the system builder, but the decisions made in this data collection process might skew the results of the models built from that data, introducing a form of bias based on the design decisions of the data collection process.

    There is a third possibility: synthetic data. If the appropriate data isn’t available, research teams can deploy synthetic data generators, to create artificial data that they believe represents the real world data the proposed application will see, along with the desired analysis that the developer wants their system to assign to that data when it sees it.

    In all of these cases, there is a presumption that the AI system should model the world as it is and perpetuate the behavior it sees in the training data, regardless of its source. And, in a world historically influenced by systemic racism and bias against broad classes of minority groups, or in the case of women, even majority groups, it is not clear at all that the best outcome is a replication of the decision-making of the past.

    If we know that qualified African American mortgage applicants have been denied mortgages in the past, or that the economic system has been biased against African Americans in the past, so that they are more likely to default on mortgages than white applicants, then training AI systems on historical mortgage data is only going to perpetuate the inherent bias encoded historically in the financial system. Similarly, if qualified minorities and women have been underrepresented in the job market in the past, training off of historical hiring data will likely reinforce that bias. If the real world has made the wrong decision in the past, due to systemic bias in societal infrastructure, training systems on historical data is going to reinforce that system bias, but in an algorithmic way, as opposed to in an anecdotal way, perhaps even making the problem worse.

    An effective way to rectify this problem is to create targets for the proposed behavior of data-driven systems, and then engineer or curate training data sets that represent the desired outcomes. This process will allow machine learning training algorithms to learn patterns for making accurate predictions on new data while ensuring that the models capture the inputs and outputs proactively.

    How does this work in reality? Let’s say you want to build a mortgage approval model that will make good decisions about loan risk, but which will treat minority applicants on a par with non-minority applicants. However, it turns out the historical data you have is biased against minority applicants. One simple way to rectify this problem is to filter the data so that the percentage of minority applicants approved for mortgages matches the percentage of non-minority applicants. By skewing the training data to represent the outcome you want to see, as opposed to the way the data reflects historical biases, you can push the machine learning algorithms to create models that treat minority applicants more fairly than they have been treated in the past.

    Some people may want to view machine learning models simply as tools used to capture and represent the world as it has been, flaws and all. Given the reality of systemic racism pervading the systems that run our society, this has led to AI-driven software encoding the racism of the past and present and amplifying it through the power of technology. We can do better, however, by using the pattern-matching ability of machine learning algorithms, and by curating the data sets they are trained on, to make the world what we want it to be, not the flawed version of itself that it has more recently been.

    Author: David Magerman

    Source: Insidebigdata

  • How business analytics can benefit your business strategically

    How business analytics can benefit your business strategically

    Business analytics can provide companies with an accurate and holistic view of their business. Executives and managers now have the ability to use data for real-time, actionable insights into everything from customer buying patterns to inventory management without having to rely on IT for outdated, static reports. In this blog, we discuss five strategic benefits of business analytics.

    Strategic benefit of business analytics 1: staff will have faster access to data

    Comparison: A conservative wait time for an IT generated report is two days. In today’s fast-paced world, a lot can change in two days and usually by the time reports are received, the data is out-of-date. Your executives and managers need to be able to access up-to-date data in order to make quick decisions that will maintain your competitive advantage.

    How would your business look: With access to up-to-date data, your sales team is empowered when interacting with prospects. Over time, this will lead to increased revenue opportunities as sales staff become aware of what customers are buying and, more importantly, what they are not buying. With this data at their fingertips, your sales managers are able to monitor their teams’ performance on a daily basis to identify and implement strategies to improve performance overall. With an easy-to-learn and intuitive BI tool like Phocas, the typical ROI timeframe is between 2-4 months after implementation, but can sometimes be even faster. 

    Strategic benefit of business analytics 2: increase customer acquisition and retention

    Comparison: Sales reps rely on the right information in the right moment. Providing your reps with potentially outdated data may result in your reps wasting time as they hunt for current facts or figures. This could result in lost sales opportunities.

    How your business would look: Armed with current, relevant access to data, your reps are able to engage in more meaningful conversations that are of real value to your customers. By having  data on customer behavior patterns, previous customer feedback, customer preferences, and buying habits, your reps will know what your customers truly want and have the ability to demonstrate the value of your product or service to them. When prospects feel heard, they are more inclined to become loyal and satisfied customers. A quality BI tool will be accessible from mobile devices ensuring your reps have access to your data even when they are out of the office.

    Strategic benefit of business analytics 3: measure the effectiveness of campaigns  

    Comparison: Traditional marketing efforts are a game of trial and error. Businesses implement a strategy and wait to see if their efforts pay off. If sales increase, it’s assumed the strategy is successful. If not, the strategy is tweaked or scrapped for a new plan-of-action.

    How your business would look: BI empowers you to design, monitor, and evaluate the success of your promotional and marketing campaigns by offering real-time insight into how customers are reacting to them. By identifying which campaigns receive the best responses, you can streamline your marketing budget and allocate funds for the best ROI. If a campaign is not generating a positive response, you are able to quickly reorganize the promotion or customize the campaign message accordingly.

    Strategic benefit of business analytics 4: New sales opportunities will regularly present themselves

    Comparison: An Excel spreadsheet can inform your team that sales for a specific product are up, but it can’t clarify whether a specific color or other characteristic is performing better than others. Nor can spreadsheets indicate why certain products are underperforming. BI provides businesses with the ability to quickly evaluate data to identify sales issues and opportunities more effectively than ever before. 

    How your business would look: BI allows your team to quickly detect emerging sales trends by analyzing company data on customers as well as various market conditions. Your team will have the ability to swiftly visualize detailed changes in customer behaviors to reveal emerging opportunities.  By leveraging these insights, sales teams can improve the accuracy of their sales predictions and respond accordingly.

    Strategic benefit of business analytics 5: More stock moving off the shelves

    Comparison: Static reports identify the quantity of a product a company has on hand when the report is generated, and which products are slow moving or have become dead stock sitting in your warehouse graveyard. However, these reports cannot identify the cause of slow moving or dead stock, nor prevent future dead stock. It’s difficult for a company to avoid this situation without a tool in place to accurately monitor the purchasing process.

    How your business would look: BI can help you to isolate poor purchasing decisions because you are no longer relying on outdated static reports. With BI you are able to monitor inventory-to-purchase ratio, stock turns, and slow-moving stock by product, territory, or manufacturer. With BI, you are able to refine your inventory management processes. By identifying product selling patterns, you are able to reduce excess inventory and the cost to maintain it. Visualizations provide a clear picture of how much to order, when, and at what price. In addition to ensuring your stock moves, your managers are able to utilize the information to effectively adjust pricing tiers to increase your profit margins.
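    The ratios mentioned here are simple to compute once sales and stock figures sit in one place. A minimal sketch, where the product fields and the turns threshold are illustrative, not taken from any particular BI tool:

```python
def stock_turns(cost_of_goods_sold, avg_inventory_value):
    """Stock turns: how many times inventory is sold and replaced per period."""
    return cost_of_goods_sold / avg_inventory_value

def inventory_to_purchase_ratio(on_hand_value, purchases_value):
    """On-hand stock relative to purchases over the same period; a rising
    ratio flags stock arriving faster than it sells."""
    return on_hand_value / purchases_value

def slow_movers(products, turns_threshold=2.0):
    """Flag SKUs whose stock turns fall below a (hypothetical) cutoff."""
    return [p["sku"] for p in products
            if stock_turns(p["cogs"], p["avg_inventory"]) < turns_threshold]


# Example: product A turns over once a year, product B ten times.
catalog = [
    {"sku": "A", "cogs": 10_000, "avg_inventory": 10_000},
    {"sku": "B", "cogs": 50_000, "avg_inventory": 5_000},
]
flagged = slow_movers(catalog)
```

    A BI tool adds the live data feeds and visualization on top; the underlying metrics are no more complicated than this.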

    Having your customer, sales, and inventory data at your fingertips gives you leverage to rapidly adapt to an ever-changing sales climate. With the right Business Intelligence tool in place companies are able to increase profit margins, reduce spending, and achieve competitive excellence.

    Source: Phocas Software

  • How data can aid young homeless people

    How data can aid young homeless people

    What comes to mind when you think of a “homeless person”? Chances are, you’ll picture an adult, probably male, dirty, likely with some health conditions, including a mental illness. Few of us would immediately recall homeless individuals as family members, neighbors, co-workers and other loved ones. Fewer still are likely aware of how many youths (both minors and young adults) experience homelessness annually.

    Homeless youth are a population that can become invisible to us in many ways. These youth may still be in school, may go to work, and interact with many of our public and private systems, yet not have a reliable and safe place to sleep, eat, do homework and even build relationships.

    Youth experiencing homelessness is, in fact, far more prevalent than many people realize, as the Voices of Youth Count research briefs have illustrated. Approximately 1 in 10 young adults (18-25) and 1 in 30 youth (13-17) experience homelessness over the course of a year. That’s over 4 million individuals.

    When I worked for the San Bernardino County Department of Behavioral Health, we ran programs specifically targeting homeless youth. The stories of lives changed by supportive care are still motivating! My role at the County focused primarily on data. At SAS, I have continued to explore ways data can support whole person care, which includes the effects of homelessness on health. 

    I see three primary ways data can be powerful in helping homeless youth: 

    1. Data raises awareness

    Without good data, it’s hard to make interventions. Health inequity is a good example of this: if we don’t know where the problem is, we can’t change our policies and programs.

    The National Conference of State Legislatures has compiled a range of data points about youth homelessness in the United States and information on related policy efforts. This is wonderful information, and I appreciate how they connect basic data with policy.

    At the same time, this kind of data can be complicated to compile. Information about youth experiencing homelessness can be siloed, which inhibits a larger perspective, like a regional, statewide, or even national view. We also know there are many intersections with other public and private systems, including education, foster care, criminal justice, social services, workforce support and healthcare. Each system has a distinct perspective and data point.

    What would happen if we were able to have a continuous whole person perspective of youth experiencing homelessness? How might that affect public awareness and, by extension, public policy to help homeless youth?

    2. Data informs context and strengths

    While chronic health conditions are often present with homeless youth, this is also an issue with family members, leading to family homelessness. First off, this is an important example of not looking at people as just individuals, but as part of a bigger system. That fundamentally requires a more integrative perspective.

    Further, homeless youth experience higher rates of other social factors, such as interactions with foster care, criminal justice, and educational discipline (e.g., suspensions). Add on top of that other socio-economic contexts, including racial disparities and more youth from the LGBTQ+ communities.

    Just as I talked about the evaluation of suffering in general, having a more whole person perspective on homelessness is critical in understanding the true context of what may be contributing to homelessness… as well as what will help with it.

    It is easy to focus on all the negative outcomes and risk factors of homelessness in youth. What happens when we can start seeing folks experiencing homelessness as loved and meaningful members of our communities? Data that provides more holistic perspectives, including strengths, could help shift the narrative and even combat stigma and discrimination.

    In my role at San Bernardino County, I helped oversee and design program evaluation, including using tools, like the Child and Adolescent Needs and Strengths (CANS), to assess more holistic impacts of acute programs serving homeless youth. Broadening our assessment beyond basic negative outcomes to include metrics like resilience, optimism, and social support not only reinforces good interventions, but also helps us to see the youth experiencing homelessness as youth worthy of investment.

    That’s invaluable.

    3. Data empowers prevention and early intervention 

    Finally, homelessness is rarely a sudden event. In most cases, youth and their families experiencing homelessness have encountered one or more of our community systems before becoming homeless. I’ve talked before about using more whole person data to proactively identify high-risk people across public (especially health) systems.

    This approach can lead to early identification of people at risk of homelessness. If we can identify youth and family in an early encounter with health, social services, foster care or even the criminal justice system, could we better prevent homelessness in the first place? Some people will still experience homelessness, but could this same approach also help us better identify what kinds of interventions could reduce the duration of homelessness and prevent it from recurring?

    With whole person data, we can continue to refine our interventions and raise more awareness of what best helps youth experiencing homelessness. For instance, research has recognized the value of trauma-informed care with this population. The National Child Traumatic Stress Network has a variety of information that can empower anyone to better help homeless youth.

    In honor of National Homeless Youth Awareness Month and recognizing the importance of homelessness in general, I encourage you to explore some of these resources and read at least one to become more aware of the reality of the experience of homeless youth. That’s the first step in moving us forward.

    Author: Josh Morgan

    Source: SAS

  • How organizations can control the carbon emissions caused by using the cloud

    How organizations can control the carbon emissions caused by using the cloud

    The move to the cloud is not necessarily a carbon-free transition, which means businesses need to be folding cloud-based emissions into their overall ESG strategy.

    Cloud computing is an increasing contributor to carbon emissions because of the energy needs of data centers.

    With demand for digital services and cloud-based computing rising, industry efforts concentrated on energy efficiency will be required. This means organizations across all verticals must fold their cloud carbon footprint into their environmental, social, and governance (ESG) targets.

    This is especially true for those organizations that have committed to net-zero or science-based targets or other similar decarbonization commitments, as cloud computing would need to be accounted for in the calculations.

    Depending on an organization’s business model, and especially for companies that focus on digital services, the energy consumed through cloud computing can be a material portion of their overall emissions.

    In addition, shifting to the cloud can contribute to the reduction of the carbon footprint if it is approached with intent, and explicitly built into the DNA of technology deployment and management.

    Major Cloud Providers Offering Insight

    Casey Herman, PwC US ESG leader, explained that the major cloud service providers -- Google, Amazon, Microsoft -- are already providing data on energy usage and emissions on a regular basis.

    “Smaller players are still playing catch-up either providing online calculations, which require customers to be responsible for securing these values, or there is no information provided at all,” he says. “CIOs should have their operational teams monitor these and preferentially select those service providers that provide real-time tools to optimize the energy usage.”

    He notes that CIOs should also increasingly build or purchase tools that allow a holistic view across all the cloud computing impacts: Currently, they would need to look at each provider separately and then aggregate them external to any tools that may be provided by service providers.

    “At PwC, we have been piloting an IT sustainability dashboard that collects data from public cloud providers and on-premises systems and then provides views on key sustainability metrics like energy reuse efficiency or carbon usage effectiveness,” he adds.
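    The two metrics Herman names have published definitions from The Green Grid, and once the energy and emissions figures are collected from provider reports they reduce to simple ratios. A sketch, assuming the inputs have already been aggregated:

```python
def carbon_usage_effectiveness(total_co2_kg, it_energy_kwh):
    """CUE: kilograms of CO2 emitted by the data center per kWh of
    IT-equipment energy. Lower is better; zero means carbon-free power."""
    return total_co2_kg / it_energy_kwh

def energy_reuse_effectiveness(total_energy_kwh, reused_energy_kwh, it_energy_kwh):
    """ERE: like PUE, but energy reused outside the data center
    (e.g., waste heat piped to nearby buildings) is credited back."""
    return (total_energy_kwh - reused_energy_kwh) / it_energy_kwh


# Example figures (illustrative): a facility drawing 1,500 MWh total,
# reusing 300 MWh of waste heat, with 1,000 MWh consumed by IT gear.
ere = energy_reuse_effectiveness(1_500_000, 300_000, 1_000_000)
cue = carbon_usage_effectiveness(450_000, 1_000_000)
```

    A dashboard like the one PwC describes would compute these per provider and per period, then aggregate them into a single view.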

    Herman says that ultimately, organizations are seeking greater use of data for more advanced analysis, which will consume increasingly more computing power, which translates to more energy.

    “Cloud service providers have been quick to reduce their carbon footprints, including public statements and investing money in renewables and carbon capture projects,” he says. “These organizations are putting in a carbon-neutral infrastructure that could then support the current and growing demand for data, analytics, and computing power.”

    Using Migration to Install Tools

    In fact, shifting to the cloud (provided it's the right provider) could reduce a company's carbon footprint through optimization and rationalization of on-premises/private data centers to more efficient (energy and carbon) cloud-based data centers.

    A company can also use their cloud migration program as a catalyst to transform their technology footprint and become environmentally conscious by design.

    Herman says that this can include re-architecting applications and building within enterprise architecture a strategy to utilize more discrete and reusable components (microservices, APIs), preventing wasteful use of energy in the cloud.

    The key to getting cloud carbon impact initiatives underway is aligning the ambition and strategy of the overall business with the IT and digital function around ESG and being an active champion of the ESG agenda within the organization.

    “Without the tools to measure the carbon footprint of their cloud footprint, companies will struggle to holistically aggregate relevant carbon impact for their IT department or manage to net zero, especially when these represent meaningful parts of their overall footprint,” Herman says.

    He explains that measurement tools and processes will also allow the organization to leverage that same data and insights to support decarbonization agendas and strategies in the business.

    AI Provides Insight into Cloud Emissions

    For Chris Noble, co-founder and CEO of Cirrus Nexus, the focus for his company has been on an artificial intelligence designed to help companies quantify and shrink the level of carbon their cloud operations produce.

    “By giving organizations the chance to impose a cost on that carbon, it allows them to make a better-informed business decision about their impact on the environment, and then to drive that actual behavior,” he says.

    By giving businesses a window into how much emissions their cloud computing demands are producing, those organizations are then able to form a roadmap that will help their ESG strategy.

    This is a part of transparency reporting, which Noble notes will be increasingly required through government regulations.

    “There's a lot of people making claims about carbon neutrality, but there's no way to verify that -- there's no proof,” he says. “What we allow companies to do is to see what that activity is.”

    He says that for IT departments to understand cloud-based carbon emissions as a business problem, they need parameters and metrics by which they can tag on cost on the issue and work toward resolving it.

    “How do we educate, inform and drive that behavioral change across their environments?” Noble says. “We spend a lot of time doing that.”
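    The article doesn’t describe Cirrus Nexus’s method, but the basic arithmetic of putting a price on cloud carbon can be sketched as follows. The grid intensity and internal carbon price below are illustrative inputs, not figures from the company:

```python
def workload_emissions_kg(energy_kwh, grid_intensity_kg_per_kwh):
    """Estimated CO2 (kg) from a workload: energy used times the carbon
    intensity of the grid powering the data center region."""
    return energy_kwh * grid_intensity_kg_per_kwh

def internal_carbon_cost(emissions_kg, price_per_tonne):
    """Translate emissions into money using an internal carbon price,
    so carbon shows up in the same units as the cloud bill."""
    return emissions_kg / 1000 * price_per_tonne


# Example: 10,000 kWh in a region at 0.4 kg CO2/kWh, priced at $50/tonne.
emissions = workload_emissions_kg(10_000, 0.4)   # 4,000 kg CO2
cost = internal_carbon_cost(emissions, 50)       # $200
```

    Attaching a dollar figure like this is what lets teams compare regions or providers and "drive that behavioral change" Noble describes.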

    Reliable Data Intelligence is Critical

    Elisabeth Brinton, Microsoft’s corporate vice president of sustainability, says that accurate, reliable data intelligence is critical for the success of ESG initiatives.

    “For organizations to truly address the sustainability imperative, they need continuous visibility and transparency into the environmental footprint of their entire operations, their products, the activities of their people and their value chain,” she says.

    Just as organizations rely on real-time financial reporting and forecasts to guide decisions that affect the fiscal bottom line, they need foundational intelligence to inform sustainability-related decisions.

    “Leveraging a cloud platform offers organizations comprehensive, integrated, and increasingly automated sustainability insights to help monitor and manage their sustainability performance,” Brinton says.

    With cloud technology and a partner ecosystem, cloud providers like Microsoft are also bringing integrated solutions to connect organizations and their value chain, ultimately helping organizations integrate sustainability into their culture, activities, and processes to prioritize actions to minimize their environmental impact.

    Microsoft Cloud for Sustainability is the company’s first horizontal industry cloud designed to work across multiple industries, with solutions that can be customized to specific industry needs. At its core is a data model that aligns with Greenhouse Gas Protocols -- the standard in identifying and recording emissions information.

    Brinton explains as the company operationalizes its sustainability plan, Microsoft is sharing its expertise and developing tools and methods customers can replicate.

    “We’re also thinking about where we’re going, what we have to solve as a company to walk our own talk, and how we’re going to enable our customers to deal with that complexity so that at the end, they’re coming out on the other side as well,” Brinton says.

    The Customer Demand for Clean Clouds

    Kalliopi Chioti, chief ESG officer at financial services software firm Temenos, notes that banks are heavy users of data centers, so being a part of this positive trend -- moving from legacy on-premises servers to modern cloud infrastructure -- will have a significant impact on emissions.

    Temenos Banking Cloud, the company’s next-generation SaaS, incorporates ESG-as-a-service to help banks reduce their energy and emissions, gain carbon insights from using their products, and to track their progress towards reaching their sustainability targets.

    It also runs on public cloud infrastructure, and the hyperscalers Temenos partners with have all made commitments to sustainability goals, science-based targets and using 100% renewable energy. “All these energy efficiencies are passed onto our clients,” Chioti says. “Let’s also remember that banks are in a unique position to influence the transition to a low-carbon economy.”

    She points out that the move to the cloud also has commercial implications: Consumers are not passive bystanders to the climate agenda, and they are increasingly matching their money with their values and voting with their wallets.

    “If companies want to continue to thrive and grow in the new era, they need to listen to their customers,” she says. “That starts with using cloud banking solutions to transform their climate credentials and show their customers the work they are doing to transition to a low-carbon global economy.”

    Author: Nathan Eddy

    Source: InformationWeek

  • How SAS uses analytics to help with the covid-19 vaccination process

    How SAS uses analytics to help with the covid-19 vaccination process

    The management of the COVID-19 vaccination program is one of the most complex tasks in modern history. Even without the added complication of administering a vaccine during a pandemic, the race to vaccinate the populations who need it most, all while maintaining the necessary cold-storage protocols, meeting double-dose requirements, and convincing the public of the vaccine’s safety, is daunting.

    The vaccines available today are unlikely to be available in sufficient quantities to vaccinate the entire population in the near term, which creates the need for nimble, data-driven strategies to optimize limited supplies.

    Analytics can be used to:

    • Identify the location and concentration of priority populations.
    • Monitor the relative adequacy of providers capable of vaccinating critical populations.
    • Measure changes in need and demand patterns to optimize supply-chain strategies.
    • Track community-based transmission and efficacy.

    The storage and transportation of the vaccine is a complex logistical exercise, requiring coordination among governments and providers and the safe transport and storage of vaccines from manufacturers to vaccination sites.

    Using analytics to shape strategy and execution

    Since the pandemic’s beginning, SAS has partnered with customers in using analytics to:

    • Monitor the spread of infection.
    • Model future outbreaks.
    • Uncover relevant scientific literature.
    • Share real-time health insights.
    • Optimize supply chains and medical resources.

    These same analytical strategies can be used for vaccination programs. Why? Because analytics based on trusted data drives the best decisions. Below are some examples of what we mean.

    Develop immediate and long-term vaccination strategies

    SAS can help you create a data-driven strategy to identify and estimate critical populations that will benefit the most people. Governments have struggled to balance the need to create an orderly, risk-driven prioritization strategy while quickly administering all of the doses they have been allocated. Integrating data to calculate the size of prioritized populations in given geographic areas enables a data-driven vaccine allocation strategy that maximizes throughput and minimizes wasted dosages. Locating and estimating the size of these populations will be critical to developing an effective allocation strategy. This complex task can be fraught with technical challenges; for instance, creating an analytically valid estimation that identifies targeted populations across data sources.
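
    To make the allocation idea above concrete, here is a minimal sketch (hypothetical region names and numbers, not SAS software) of allocating a limited dose supply in proportion to each region's estimated priority population, using a largest-remainder step so no whole dose is wasted:

```python
# Hypothetical sketch: split a limited vaccine supply across regions in
# proportion to each region's estimated priority-population size.
def allocate_doses(priority_pop, total_doses):
    """priority_pop: dict of region -> estimated priority-population size."""
    total_pop = sum(priority_pop.values())
    # Ideal fractional shares, then largest-remainder rounding so the
    # whole-dose allocations sum exactly to total_doses.
    shares = {r: p * total_doses / total_pop for r, p in priority_pop.items()}
    alloc = {r: int(s) for r, s in shares.items()}
    leftover = total_doses - sum(alloc.values())
    for r in sorted(shares, key=lambda r: shares[r] - alloc[r], reverse=True)[:leftover]:
        alloc[r] += 1
    return alloc

print(allocate_doses({"North": 12000, "South": 8000, "East": 5000}, 10_000))
# {'North': 4800, 'South': 3200, 'East': 2000}
```

    Real allocation strategies would layer on throughput, provider capacity, and equity constraints; the proportional split is only the starting point.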

    To succeed, governments and health agencies will need to integrate data to identify critical populations, enable populations to be further subset to accommodate unknowns in vaccine supply, and model vaccination impact on priority outcomes. Given the variety of public and private organizations collaborating on this response, the best solution will drive open, transparent communication across diverse agencies.

    Visual analytics is paramount because showing priority population data on maps can also speed strategy development. Using proximity clustering and hot-spotting technology, leaders can identify population densities to ensure adequate vaccine supply. Epidemiological models can help ensure continued situational awareness, so that prioritization and allocation approaches don’t become reliant on point-in-time data, but are instead part of a continuous-learning system that is responsive to on-the-ground changes in the pandemic.
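
    The hot-spotting idea can be sketched very simply: bucket point locations into grid cells and flag cells whose density crosses a threshold. This toy example (hypothetical coordinates, not the proximity-clustering technology the text refers to) shows the core of the computation:

```python
from collections import Counter

# Toy hot-spot detection: bucket (x, y) locations into square grid cells
# and flag any cell containing at least `threshold` points.
def hot_spots(points, cell_size=1.0, threshold=3):
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    return {cell for cell, n in counts.items() if n >= threshold}

pts = [(0.1, 0.2), (0.4, 0.9), (0.7, 0.3), (5.1, 5.2)]
print(hot_spots(pts))  # only the dense cell (0, 0) is flagged
```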

    Monitor vaccination capacity and adverse events

    Identifying and recruiting enough providers to ensure sufficient access to COVID-19 vaccines (especially once supplies increase) will be crucial. SAS has experience working with government health agencies to monitor the adequacy of health care provider networks, a skill set and technology base that can provide agencies with an evidence-driven view of vaccine administration capacity and vaccination goals.

    We work with commercial partners worldwide to augment the public health workforce to meet rising demand for vaccines. Related data such as storage capacity and throughput can be calculated and included for a fuller understanding of network adequacy.

    As more data is collected regarding adverse events, SAS continues to help with health surveillance and research for many national health regulatory agencies today.

    Optimize supply chain strategies

    Health and human service agencies are being asked to allocate vaccine supply based on a range of complex, interrelated factors that include populations served and providers’ capability for storing and refrigeration. Optimizing these distribution strategies while facing fluctuating supplies, evolving need and changing provider enrollments will require a strong data and analytic approach.

    SAS offers end-to-end supply chain analysis to assist agencies in an efficient, coordinated vaccine distribution response. By capturing inventory, demand, capacity and other related data across the distribution chain, you can create models that determine how agencies can optimize allocation strategies while accounting for the dynamic nature of pandemic outbreaks. The outcome is a set of flexible, adaptable plans for vaccination processing, inventory monitoring and distribution.
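
    A stripped-down sketch of that idea (illustrative only, not SAS's optimization software): cap each provider's shipment by its cold-storage capacity, then scale down proportionally when supply falls short of capped demand.

```python
# Hypothetical shipment planner: shipments are limited by each site's
# cold-storage capacity, then scaled down if total supply is insufficient.
def plan_shipments(sites, supply):
    """sites: dict of site -> {"demand": int, "capacity": int}."""
    want = {s: min(v["demand"], v["capacity"]) for s, v in sites.items()}
    total = sum(want.values())
    if total <= supply:
        return want
    scale = supply / total
    return {s: int(w * scale) for s, w in want.items()}

sites = {"clinic": {"demand": 500, "capacity": 400},
         "pharmacy": {"demand": 300, "capacity": 600}}
print(plan_shipments(sites, 350))  # {'clinic': 200, 'pharmacy': 150}
```

    A production model would also account for lead times, second-dose reservations, and outbreak dynamics, as the text notes.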

    Dose administration analytics

    Vaccination administrators must report certain data elements in near-real time (through electronic health records or directly via state immunization information systems). This information is a critical tool in creating rapid-response analytics that can guide decision making and future planning. Unfortunately, long-term underinvestment in our public health IT infrastructure has led to significant data quality challenges and weak reporting capabilities, which collectively prevent a data-driven vaccination strategy.

    Our data management solutions can assist agencies in creating a trusted, consolidated vaccination record. This includes automating tedious and manual processes such as data preparation, data integration and entity resolution to provide analysts more time for treatment and vaccination efforts. With this reconciled vaccination data, SAS can provide analytics to help agencies:

    • Predict evolving resource needs across jurisdictions such as states, regions and countries to optimize allocation strategies.
    • Monitor uptake to help ensure alignment with anticipated need, provider requests and vaccine distributions.
    • Analyze unexpected gaps in vaccination administration to guide outreach and engagement efforts.
    • Anticipate barriers to delivering second doses.
    • Gain insights on changes in susceptibility, rate of transmission, status of population immunity, etc.

    Managing a cold chain for biologics

    In the US, the CDC has updated the Vaccine Storage and Handling Toolkit to outline the proper conditions for maintaining an effective COVID-19 vaccine under cold-chain processes. Cold chain is a logistics management process for products that require specific refrigerated temperatures from the point of manufacturing through distribution and storage until the vaccine is administered. But how do you collect data along the chain to ensure product safety? New internet-connected sensors now travel along with the vaccines. Collecting and analyzing data like that allows administrators to monitor, track and optimize distribution strategies in this multi-layered and complex vaccine rollout.
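
    As a small illustration of what analyzing that sensor data can look like (a hypothetical log format and temperature band, not a CDC tool), this sketch flags cold-chain excursions where the temperature leaves the allowed range for longer than a grace period:

```python
# Hypothetical cold-chain check: scan a time-ordered sensor log and report
# spans where the temperature stayed out of band longer than max_minutes.
def find_excursions(readings, low=-80, high=-60, max_minutes=15):
    """readings: list of (minute, temp_celsius) pairs in time order."""
    excursions, start = [], None
    for minute, temp in readings:
        if not (low <= temp <= high):
            start = minute if start is None else start  # excursion begins/continues
        elif start is not None:
            if minute - start > max_minutes:
                excursions.append((start, minute))
            start = None  # back in band
    return excursions

log = [(0, -70), (10, -55), (20, -50), (30, -52), (40, -72)]
print(find_excursions(log))  # [(10, 40)]
```

    The sketch ignores an excursion still in progress at the end of the log; a real monitor would alert on that too.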

    The path forward

    As you read this, shipping and logistics companies are recording data on vaccine temperature and location.  Governments are rapidly transforming themselves into organizations capable of allocating, distributing and administering vaccines and their necessary components at massive scale. Retailers (pharmacies) are implementing customer contact programs to help track, administer and verify vaccinations.

    The coordination across these various public and private companies is critical for a successful vaccination program. Even though the scale of this operation is historic, the sub-components of the process can be likened to other large, data-driven strategies.

    Author: Steve Kearney

    Source: SAS

  • How to Communicate Challenging Insights from your Market Research  

    How to Communicate Challenging Insights from your Market Research

    When the truth resulting from our study is positive, it can be a joy to communicate those insights. However, research efforts sometimes reveal different results than were expected or hoped for. Although being the bad news bearer is never fun, ultimately it’s these types of scenarios that best make the case for the cruciality of market research. Accurate results, even if disappointing, can inform strategic pivots that can mark detours from disastrous marketing mistakes and place well-informed companies on the path to success.

    Treading Carefully

    Especially if bad news is anticipated, stakeholders may be eager for early access to research data. While it's important to be responsive, it is also wise to proceed with caution. Sharing preliminary previews and toplines may be necessary but ensure you only do so when you have sufficient substantiation to avoid skewed data. The first insights you communicate can often be perceived as “the answer,” so if final data analysis ultimately points to a different conclusion than data previews, this can create confusion and erode trust in the results. Strive to strike a balance between satisfying stakeholder curiosity and protecting data integrity.

    Crystal Clear Seas

    Clarity becomes paramount when presenting research results with negative implications. Ensure your report findings are concise, straightforward, and supported by substantial evidence. A great way to buoy the story is to incorporate verbatim quotes or video clips from interviews or open-ended responses from surveys. These authentic narratives in respondents' own words help illustrate the deeper insights behind the data and lend credibility to your findings.

    Surfacing Amidst The Waves

    While it's essential not to shy away from delivering bad news, it's equally important to highlight any positive aspects present in the data. These glimmers of good allow research stakeholders to come up for air amidst the waves of bad news and can help them better absorb the findings. Identify elements that demonstrate areas of strength, potential, or opportunities for improvement. Presenting positive findings as examples to continue or build upon can help balance the overall perception and encourage a more constructive discussion around the challenging aspects.

    A Lifeboat

    When possible, include a separate "respondent-generated" recommendations for improvement section in your report in addition to your recommendation summary. Spotlighting potential solutions directly from the perspective of the target audience can serve as a lifeboat, creating a hopeful way out amidst a sea of bad news. By incorporating respondents' suggestions and insights, you demonstrate that negative findings can serve as a catalyst for growth and transformation. These recommendations provide a clear way forward for stakeholders to take actionable steps to address the issues raised in the research.

    In The Same Boat

    Delivering research results that convey bad news is a delicate task that can feel like navigating shark-infested waters. By adhering to a strategic approach, market research professionals can effectively share challenging insights while maintaining trust in the findings. Treading carefully as you communicate early data, ensuring clear and well-substantiated findings, highlighting any positives, and buoying your findings with respondent-generated verbatims and recommendations help remove the researcher from the results. Our ultimate goal as researchers is to sail alongside our companies through even the most turbulent waters, providing honest insights that guide them towards improvement, growth, and success.

    Date: July 6, 2023

    Author: Heidi Loften

    Source: Decision Analyst

  • How your organization can establish a results-based data strategy

    How your organization can establish a results-based data strategy

    Money never sleeps and neither does your data. In this article, we look at digital transformation: the ways of turning data into new revenue streams and apps that boost income, increase stickiness, and help your company thrive in the world of Big Data. 

    The first waves of digital transformation started decades ago and the ripples of this trend continue to be felt to this day. However, what exactly a digital transformation looks like varies widely from company to company. One common theme among many transformations, however, is trying to make better use of data, whether to build analytic apps to unlock new revenue streams or to make smarter decisions internally (or both).  

    While these are worthwhile applications, one blind spot that many teams charged with these projects share is that they look at the data they have on-hand before figuring out what kind of problems they wish to solve with it. 

    “I recommend starting your data strategy with a right-to-left approach, focusing on the desired business outcomes first, instead of the data, to support those outcomes,” says Charles Holive, Sisense Managing Director of Data Monetization and Strategy Consulting. “And there are primarily three areas that industries across the world look to improve: the top line, the bottom line, and customer satisfaction.”

    Define your desired outcome before you start building

    Every company knows they need to digitally transform in order to survive and excel in the modern era. However, many organizations fail to define their goals for this process before they start, and predictably encounter obstacles or outright failures instead of paving a path for future success.

    Business goals should be defined at the very beginning of the digital transformation in the “right-to-left strategy” that starts by answering this question: What is the organization specifically looking to solve or improve? Understanding the details is key, otherwise “digital transformation” will be merely a corporate buzzword that causes headaches, heartbreaks, and lost money instead of producing measurable improvements.

    From there, rather than trying to accumulate and house the company’s entire dataset, the digital transformation team should identify the specific actionable insights and associated data needed to solve for (and measure) agreed-upon outcomes.

    “Not every dataset is created equal; some are more valuable than others. Being outcome-focused is a way you can stack-rank the datasets that are most important. Your team can then begin moving that most-important data into your warehouse.”
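
    The stack-ranking idea can be sketched in a few lines. This is a hypothetical scoring scheme (the outcome names and weights are invented for illustration): score each candidate dataset by the business outcomes it informs, then load the warehouse in that order rather than by availability.

```python
# Hypothetical stack-ranking: weight the three outcome areas from the text
# and rank datasets by how directly they inform them.
def stack_rank(datasets, outcome_weights):
    """datasets: dict of name -> set of outcomes that dataset informs."""
    def score(name):
        return sum(outcome_weights.get(o, 0) for o in datasets[name])
    return sorted(datasets, key=score, reverse=True)

weights = {"top_line": 3, "bottom_line": 2, "csat": 1}
data = {"web_logs": {"csat"},
        "sales_orders": {"top_line", "bottom_line"},
        "support_tickets": {"csat", "bottom_line"}}
print(stack_rank(data, weights))
# ['sales_orders', 'support_tickets', 'web_logs']
```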

    Experiment to guide a winning data strategy

    Just as the waterfall method of software development (gathering all the requirements upfront, then building and releasing a complete application) has fallen out of favor relative to agile methods, the same shift should happen when creating an outcome-first data strategy: rather than trying to build a complete data warehouse right from the outset, approach data strategy as an “innovation factory.”

    “Identifying the exact data you need to solve a singular problem results in a perfect candidate to go into your warehouse on the first cycle. This is because you know exactly what the business is going to do with that data set,” Charles explains. “It’s powerful because it’s already informing or measuring a specific business outcome.”

    And when this data is warehoused and accessible to business partners to make key decisions, you already have a chance to quickly prove this outcome-first data strategy. You’ve immediately created an experiment to win.

    Another piece of advice that Charles talks about in his “Hacking the Analytic Apps Economy” video series is where the innovation factory should live. Namely, not in a mature business unit, but in an agile, fast-reacting department that reports to a Chief Innovation Officer or similar highly-placed exec. This team can deliver on new ideas quickly and won’t get bogged down in pre-existing frameworks or goals that don’t work for what the new data experiments are trying to achieve.

    Create an innovation factory at your company

    “Creating an innovation factory for your company results in faster innovation. You can do these smaller experiments more cost-efficiently, saving money over the traditional data strategy. This also should help your team prioritize projects for the data warehouse that deliver the greatest value, as opposed to the department that screams the loudest.” 

    Any experiment can fail, but here are some solid tips to help improve your likelihood of success and to maximize the impact of successful experiments: 

    • Start by listening to the frontline employees who use the data to make decisions; this will improve the odds of success for your experiment out of the gate.
    • If your experiment works, find other departments that can benefit from that same data. This is where a good semantic layer on top of your data warehouse (courtesy of your data analytics software) is key, so you can repurpose the same dataset for different ends.
    • If your experiment fails, see if you can tweak the dataset or use case to apply elsewhere in the company.

    Regardless, approaching data strategy with a focus on business outcomes will put you on the right course.

    “Everything else in the company is business-centered. It just seems counterintuitive not to approach data strategy in the same way.”

    Author: Jack Cieslak

    Source: Sisense


  • Human actions more important than ever with historically high volumes of data

    Human actions more important than ever with historically high volumes of data

    IDC predicts that our global datasphere (the digital data we create, capture, replicate and consume) will grow from approximately 40 zettabytes of data in 2019 to 175 zettabytes in 2025, and that 60% of this data will be managed by enterprises.

    To both manage and make use of this near-future data deluge, enterprise organizations will increasingly rely on machine learning and AI. But IDC Research Director Chandana Gopal says this doesn’t mean that the importance of humans in deriving insights and decision making will decrease. In fact, the opposite is true.

    'As volumes of data increase, it becomes vitally important to ensure that decision makers understand the context and trust the data and the insights that are being generated by AI/ML, sometimes referred to as thick data', says Gopal in 10 Enterprise Analytics Trends to Watch in 2020.

    In an AI automation framework published by IDC, we state that it is important to evaluate the interaction of humans and machines by asking the following three questions:

    1. Who analyzes the data?
    2. Who decides based on the results of the analysis?
    3. Who acts based on the decision?

    'The answers to the three questions above will guide businesses towards their goal of maximizing the use of data and augmenting the capabilities of humans in effective decision making. There is no doubt that machines are better suited to finding patterns and correlations in vast quantities of data. However, as it is famously said, correlation does not imply causation, and it is up to the human (augmented with ML) to determine why a certain pattern might occur'.

    Training employees to become data literate and conversant with data ethnography should be part of every enterprise organization’s data strategy in 2020 and beyond, advises Gopal. As more and more decisions are informed and made by machines, it’s vital that humans understand the how and why.

    Author: Tricia Morris

    Source: Microstrategy

  • Interpreting market data during the COVID-19 pandemic

    Interpreting market data during the COVID-19 pandemic

    Business and economic activity fluctuates during the course of the year. Some of this fluctuation is idiosyncratic, but much is tied to the time of year and the change of seasons.

    Outdoor construction projects are easier to do in dry, warm weather, so especially in the northern half of the United States, construction spending and housing starts tend to be higher during the second and third calendar quarters than they are during the other two quarters.

    The volume of retail sales for clothing and electronics is higher in December because of seasonal gift-giving. Economy-wide, clothing and electronics merchandisers can expect to record about one-sixth of their annual sales in the month of December.

    In contrast, grocery stores experience much less monthly variance in the level of sales. Employment, especially for contract and temporary workers, also exhibits seasonal changes.

    Identifying “True” Trends for a Business or Economic Sector

    When a businessperson is trying to assess the trend of a company’s sales (or when an economist is trying to assess the health of economic activity), he or she will want to filter out these expected seasonal effects to get a better view of the “true” trends. A quick and easy way to abstract from seasonal effects is to compare current activity with the activity in the same period the previous year.
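
    The "quick and easy" comparable-period filter described above amounts to dividing each month by the same month a year earlier. A minimal sketch (hypothetical sales figures):

```python
# Year-over-year growth: compare each month to the same month the prior
# year, which cancels out the regular seasonal component.
def yoy_growth(monthly_sales):
    """monthly_sales: dict of (year, month) -> value."""
    return {(y, m): v / monthly_sales[(y - 1, m)] - 1
            for (y, m), v in monthly_sales.items()
            if (y - 1, m) in monthly_sales}

sales = {(2019, 12): 200, (2020, 11): 110, (2020, 12): 220}
print(yoy_growth(sales))  # Dec 2020 is up about 10% year over year
```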

    When a CEO discusses financials, he or she will compare sales in the recent months to sales in the same calendar months a year and two years earlier. When the CFO of a public company presents the latest quarterly results to a group of analysts, he or she will compare revenues not just to the previous quarter, but also to the same quarter the previous year.

    Common Distortions in the Data to Watch

    The “comparable period” approach is not foolproof for abstracting from seasonal effects. For example, the Lunar New Year is a big driver of economic activity in China because it is associated with gift-giving, entertainment, and travel. Measured on the solar calendar, it is a moveable holiday: some years it falls in January, others in February. When comparing economic activity in either of those two months to economic activity a year earlier, one must be aware of when the Lunar New Year occurred in each of those two years.

    For economic statistics, a mathematical variant of the “comparable period” approach is employed to “seasonally adjust” the data. A seasonally adjusted statistic for retail sales, motor vehicle production, or gross domestic product is reported at an annual rate that assumes that the seasonal component of activity was at its typical rate during the period.

    Thus, if December clothing sales are usually twice as high as in November, and sales in December 2020 are exactly twice as high as those in November 2020, the seasonally adjusted level of sales will be identical for November 2020 and December 2020. If sales in December 2020 are more than twice as high as those in November 2020, then the seasonally adjusted level of sales in December 2020 will exceed that in November 2020.
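
    The clothing example works out like this in code (a deliberately simplified two-month sketch; real seasonal adjustment uses estimated factors for every period): dividing each month by its seasonal factor puts the months on a comparable footing.

```python
# Simplified seasonal adjustment from the text's example: December sales
# normally run twice November's level, so December's factor is 2.0.
seasonal_factor = {"Nov": 1.0, "Dec": 2.0}

def seasonally_adjust(raw):
    return {month: value / seasonal_factor[month] for month, value in raw.items()}

# December exactly twice November -> identical adjusted levels.
print(seasonally_adjust({"Nov": 100, "Dec": 200}))
# December more than twice November -> adjusted December reads higher.
print(seasonally_adjust({"Nov": 100, "Dec": 230}))
```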

    Seasonal Echoes of the COVID-19 Pandemic

    It is easy to see that the COVID-19 pandemic will disrupt the “comparable period” approach to accounting for seasonal changes. With much economic activity constrained in the second quarter of 2020 because of lockdowns and shelter-in-place orders, second quarter 2021 levels will certainly be higher, but what will we learn from that?

    Use of comparable periods from 2019 will be one way to try to abstract from the distortions generated by the pandemic. Eagle-eyed analysts will want to pay attention to what comparable periods companies and journalists are using during this year.

    For example, I read a newspaper article last month that discussed growth prospects for electric-powered motor vehicles. To bolster the argument that electric vehicle sales were poised for takeoff, the reporter noted that in China, sales of electric vehicles in January 2021 were six times those of the previous year. The reporter did not mention that sales in January 2020 were depressed by the pandemic, which affected activity in China earlier than in Western Europe and the Western Hemisphere.

    A recent blog post from David Lucca and Jonathan Wright of the New York Fed pointed out that the pandemic will likely disrupt the more sophisticated seasonal adjustment mechanisms as well. The large economic disruption of the 2007-2009 Great Recession led to persistent seasonal echoes in seasonally adjusted data in the following years, Lucca and Wright said.

    Because seasonal adjustment routines use a weighted average of recent comparable periods to estimate the “normal” seasonal relationship, a large disruption to economic activity, such as with the pandemic or the Great Recession, will introduce spurious seasonal patterns in the historical data. In the case of years soon after the end of the Great Recession, seasonally adjusted data for the first quarter of the year typically indicated accelerating economic activity that then seemed to decelerate when the seasonally adjusted data for the second quarter became available.
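
    A toy illustration of that mechanism (invented ratios, and a plain average standing in for the weighted average real routines use): one shock year pulls the estimated seasonal factor away from the true relationship, so later periods look spuriously strong or weak.

```python
# Hypothetical December/November sales ratios; 2020 is the shock year.
dec_nov_ratio = {2017: 2.0, 2018: 2.0, 2019: 2.0, 2020: 0.8}

# Seasonal factor estimated before and after the shock enters the window.
factor_pre = sum(dec_nov_ratio[y] for y in (2017, 2018, 2019)) / 3
factor_post = sum(dec_nov_ratio.values()) / 4

print(factor_pre)   # 2.0 -- the true seasonal relationship
print(factor_post)  # 1.7 -- biased low, so later Decembers look spuriously strong
```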

    Making Inferences Will Require Extra Diligence 

    While statistical agencies can and have stepped in with manual adjustments to try to mitigate the problems caused when a large, nonseasonal shock appears, the resulting seasonally adjusted series may not be completely “fixed.” Lucca and Wright argue that "There are no easy answers to seasonal adjustment in this environment. The virus changed the economy and seasonal patterns, in some cases temporarily and perhaps permanently in other cases."

    When available, analysts should also look at unadjusted data to get a handle on how the economy is progressing, but be aware that inferences about “true” behavior will be more difficult to draw for a number of years.

    Author: Thomas Browne

    Source: Market Research Blog

  • Leveraging Product Life Cycle Management with analytics

    Leveraging Product Life Cycle Management with analytics

    Although many publications compare product data management and product life cycle management — commonly framing the debate as “PDM versus PLM” — that can create confusion. The functionality referred to as a product data management framework is more accurately a subset of a product life cycle management framework. This distinction can sometimes be blurred by the various marketing efforts of software apps.

    The matter may even be further complicated depending on whether the purchasing agent is an engineer or a manager, as there are varying degrees of overlap in PDM and PLM app functionality. An engineer may focus on PDM functionality, while business managers and executives in BI are more likely focused on an enterprise-level solution. An enterprise software selection team should represent all aspects of business to ensure the company makes a decision that will fit the needs of all stakeholders. In particular, buying a standalone PDM framework may leave the enterprise without the broader functionality of a PLM. In this article, we’ll explore the nuances of offerings along the “PDM versus PLM” spectrum. 

    As we clarify the functional distinction between the two types of frameworks and summarize the rationale for choosing one or the other, an exciting new event in the evolution of product life cycle management emerges: The application of machine-learning (ML) based analytics is sharpening PLM frameworks. We’ll begin by defining PDM as an integral part of PLM. We will then see how business intelligence within PLM is leveraging AI analytics toward new levels of insight and productivity.

    What is product data management, exactly?

    Briefly, PDM refers to domain-expert engineering knowledge and its associated software tools. PDM manages engineering data about a product, particularly computer-aided design (CAD) data, and handles product revisions. Although PDM can be used to handle the design release process, most design release processes are external to PDM framework capabilities. And this defines the most important difference between PDM and PLM: PDM provides one of many inputs to the more comprehensive enterprise PLM process.

    An engineering team commonly uses PDM software to collaborate in planning and organizing product data. Engineers can use a PDM framework to handle revisions and rollbacks, manage orders to change design specifications, and create and send a bill of materials. Engineering teams can save substantial time and remove trial and error by collaborating in a central CAD-based PDM. The CAD component includes product prototypes and models based on actual parts’ manufacturing instructions. Refinement and debugging are a natural component of PDM. A modern PDM system interfaces and shares product data with a variety of other software applications, especially enterprise PLM. Here are some of the many important methods and functions integral to PDM frameworks:

    • Augmented reality and virtual reality for real-time syncing of design data
    • Product CAD file data management 
    • Product revision control as well as revision and rollback history
    • Search functions within CAD files — recycling product specs
    • Scalability tools for spec replication
    • Output data to integrate with PLM, enterprise resource planning (ERP), and materials requirement planning (MRP)
    • Security management to authenticate permission levels for engineers
    • Automated workflows such as engineering change orders
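
    The revision-control and rollback functions listed above can be sketched minimally (a hypothetical illustration, not any real PDM product's data model): every released revision is kept, and a rollback is itself recorded as a new revision so history is never lost.

```python
# Minimal sketch of PDM-style revision control with rollback history.
class PartRecord:
    def __init__(self, part_id, initial_spec):
        self.part_id = part_id
        self.revisions = [initial_spec]  # index = revision number

    def change_order(self, new_spec):
        """Apply an engineering change order (ECO) as a new revision."""
        self.revisions.append(new_spec)
        return len(self.revisions) - 1

    def rollback(self, revision):
        """Re-release an earlier revision as a new one, preserving history."""
        return self.change_order(self.revisions[revision])

part = PartRecord("BRKT-001", {"material": "steel", "thickness_mm": 2.0})
part.change_order({"material": "aluminum", "thickness_mm": 2.5})
rev = part.rollback(0)
print(rev, part.revisions[rev])  # revision 2 restores the steel spec
```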

    A favorite function of PDM frameworks among engineers is the ability to collaborate on designs and share comments and feedback. An equipment manufacturer can improve product development and productivity by sharing product data with suppliers and marketing teams. An engineering team typically leverages a PDM through the following aspects of product development:

    • Initial product design
    • Product prototyping
    • Release to manufacturing
    • Engineering change notice (ECN) process

    ECNs are also called ECOs (engineering change orders) and are a practical procedure greatly enhanced and optimized by PDM frameworks. The release to manufacturing, mentioned above, streams naturally to many of the functions of the PLM that we will explore shortly. PDM often releases product CAD files to manufacturing. Bill of materials (BOM) data is streamed to the PLM and ERP. This final step unifies the product development in the global enterprise BI model.

    Important benefits of PDM

    Direct CAD integration is now standard in competitive product development. Computer vision is an important area of AI now enhancing many aspects of CAD in product development. Fast product data searches, product spec replication and recycling, as well as secure data access, can make the difference between a successful product release and a problematic one. Augmented reality 2D or 3D views have now elevated product design to a new level. Automated engineering change orders are likewise an industry standard that PDM frameworks have brought to automated fruition. Many previously disparate functions are now central to PDM, including:

    • Revision control
    • BOM management
    • Spec recycling and reuse

    Now that we have defined the engineering assistance provided by PDM frameworks, it will be easy to transition toward understanding the enterprise solution of PLM frameworks.

    Full-spectrum product life cycle management

    Now that PDM is defined as a subset functionality of PLM, what is a product life cycle management framework exactly? Briefly, PLM manages all aspects of a product from the initial design phase to product termination. Equivalent enterprise-level apps that tend to be unified within PLMs include ERP, customer relationship management, and manufacturing execution systems frameworks.

    In the product development scope, PLM ideally integrates the other solutions centrally as a product data sharing hub. PLM is therefore envisioned as a business intelligence strategy with the ultimate objective of maximizing profitability. The PLM framework thus seeks to promote innovation in the context of product introduction, product maintenance, and all the way through to a product’s end-of-life planning.

    An ideal PLM framework will track a product through its history, charting designated aspects of production and sales (by way of PDM inputs, service requirements, and spec revisions), all the way to product retirement. Generating vast data stores along the way, the PLM framework is now a primary beneficiary of ML-based data analytics. As such, the PLM framework unifies all business processes and enterprise applications, with the ultimate outcome of uniting people in the achievement of their best possible outcomes. When the PLM is integrated fully, the entire manufacturing supply chain benefits. Run correctly with integrated analytics, PLM optimizes product development, informs manufacturing and production rates for order attainment, and drives accurate marketing campaigns.

    Some ways this manifests include:

    • Tracking for all project development progress in designed phases with milestones, assessments, and projected outcome assurance
    • Assessment of product-related business process performance at a glance
    • Interfacing for all enterprise solutions with streaming product data in real time on development issues, hardware issues, and software QA events
    • Strategic materials sourcing through supplier relationship management
    • Context-driven and role-specific authentication and collaboration to improve data sharing, automated workflows, and BI tools and services toward actionable insights
    • Actionable insights leading to concise decisions through ML-based data analytics on product development projects
    • Flexible reporting based on portfolio management analytics
    • Occupational health awareness and product safety
    • Product and process quality assurance
    • Improved operational control through monitoring of product revisions, resource constraints, and cost factors
    • Interfacing PDM and engineering processes in product data with other BI processes and the business model as a whole

    Comprehensive benefits of PLM

    Having defined PLM as an enterprise solution for product development, let’s talk about the broad benefits of such systems. First, PLM creates a foundation for streamlined initial product designs and ECN processes. It also helps reduce development time and costs, speeding up time to market.

    Ideally, PLM frameworks tend toward:

    • Product designs that are universally visible throughout the organization, promoting collaboration and BI outcomes 
    • Reduced redundancy; increased design reuse and recycling
    • Efficient sourcing and inventory investment, leading to further improvements in manufacturing productivity outcomes
    • Documentation updates leading to higher standards of QA
    • Inception of BI models that embrace innovative ML models and analytics

    The analytics edge in PLM functionality

    All processes that generate data can be analyzed to produce valuable insights about how to improve the efficiency of those processes. Both human and machine productivity can be dramatically improved with insights gained from models trained with live data from CAD and production facilities. Translating DevOps processes into the product design context is the natural evolutionary next step in PLM. API integrations for third-party analytics platforms like Sisense now make it easy to harness the forecasting power of AI.

    Whatever you’re working on, the right analytics for your PDM or PLM data can help you build better products, services, and experiences that will delight your users and stand the test of time. Backed with insights and the right framework, you’re ready to build boldly and change the world.

    Author: Vandita Manyam

    Source: Sisense

  • Microsoft takes next cybersecurity step

    Microsoft takes next cybersecurity step

    Microsoft just announced they are dropping the password-expiration policies that require periodic password changes in Windows 10 version 1903 and Windows Server version 1903.  Microsoft explains in detail this new change and the rationale behind it, emphasizing that they support layered security and authentication protections beyond passwords but that they cannot express those protections in their baseline.  

    Welcome move

    This is a most welcome step. Forcing users to change their passwords periodically works against security: it means consumers have to write passwords down to remember them, and it does nothing to stop hackers from stealing current passwords. Hackers generally use stolen passwords very quickly, and password complexity does little to prevent the use of stolen passwords either, since hackers can just as easily capture or steal a complex password as a simple one.

    The time has long passed for organizations to stop relying on interactive passwords that users have to enter altogether. Hopefully this move by Microsoft will help drive the transition to more secure forms of authentication. Finally, a big tech company (one that manages much of our daily authentication) is applying independent, reasoned thinking rather than going along with the crowd, whose common password management practices are, counterintuitively, less secure.

    Alternative authentication forms and decentralized identity (DID)

    Biometrics on their own can also be hacked. So can one-time passwords, especially those that use SMS and other authentication methods where man-in-the-middle or man-in-the-browser attacks are possible. What is more secure (and private) is another method Microsoft and many other organizations are starting to support: decentralized identities, where users control their own identity and authentication information.

    Using this method, the user’s credential and identity data is maintained in a hardened enclave accessible only to the user via their own private key, which is typically unlocked using the user’s mobile phone and optionally another authentication factor. In the end, the consumer simply gets a notice from the site they are trying to log into and confirms the login on their mobile phone (or other device) by clicking 'yes' to the login request, optionally adding a biometric, e.g. a fingerprint or an iris scan.

    The bottom line is that authentication is layered and the user doesn’t have to remember or enter an insecure password. Most importantly, the user owns their own secured credential and identity data, and no one can access it without the user’s permission.
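    The login flow described above is a form of challenge-response authentication. The sketch below is a heavy simplification: real decentralized identity uses an asymmetric keypair, so the site never holds the user's key; the HMAC here is a symmetric stand-in used only to keep the example dependency-free and show the shape of the flow:

```python
import hmac, hashlib, secrets

# Toy challenge-response login. NOTE: real DID systems sign the
# challenge with an asymmetric key held only on the user's device;
# this HMAC over a shared secret is a simplified stand-in.

def new_challenge():
    """The site generates a fresh random challenge per login attempt,
    so a captured response cannot be replayed later."""
    return secrets.token_bytes(32)

def sign(user_key, challenge):
    """The user's device 'signs' the challenge with its locally held key
    (unlocked via the phone's PIN or biometric in a real system)."""
    return hmac.new(user_key, challenge, hashlib.sha256).digest()

def verify(user_key, challenge, response):
    """The site checks the response; no password is ever transmitted."""
    expected = hmac.new(user_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

key = secrets.token_bytes(32)        # provisioned on the user's device
challenge = new_challenge()          # site -> device
response = sign(key, challenge)      # device -> site
print(verify(key, challenge, response))  # True
```

    The security properties discussed in the article come from the fact that the secret never leaves the user's device and each challenge is single-use.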

    Decentralized identities, the path to individual control

    DIDs are supported by many organizations today. Most (but not all) mega tech companies are joining the move to standardize DID technology. The companies not joining are generally the ones that continue to earn a living by monetizing consumer data, largely through advertising and data reselling. Adding fuel to the fire, some of these companies have an abysmal record when it comes to securing consumer data.

    Hopefully consumers will start protesting the monetization of their data by adopting DID as an authentication mechanism. It’s certainly a chicken and egg problem but there is gradual adoption across sectors. For example, even the Bitcoin network just started accepting DIDs, and British Columbia in Canada has also implemented them for small business identification.

    Web 3.0

    For sure, I will gladly sign up for a DID as soon as someone asks me to. I really am at my limit in tolerating password management policies. And I’m even more tired of being subject to continuous massive data breaches that steal my most personal and sensitive information, just because I live and transact.

    I don’t think anything else short of a massive re-architecting of the web and how we manage identity data will solve all these problems of data breaches and consumer data monetization and abuse.

    Author: Avivah Litan

    Source: Gartner

  • MicroStrategy: Take your business to the next level with machine learning

    MicroStrategy: Take your business to the next level with machine learning

    It’s been nearly 22 years since history was made across a chess board. The place was New York City, and the event was Game 6 of a series of games between IBM’s “Deep Blue” and the renowned world champion Garry Kasparov. It was the first time ever a computer had defeated a player of that caliber in a multi-game scenario, and it kicked off a wave of innovation that’s been methodically working its way into the modern enterprise.

    Deep Blue was a formidable opponent because of its brute-force approach to chess. In a game where luck is entirely removed from the equation, it could run a search algorithm on a massive scale to evaluate moves, discarding candidate moves once they proved to be less valuable than a previously examined and still available option. This giant decision tree powered the computer to a winning position in just 19 moves, with Kasparov resigning.
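    That pruning strategy, discarding a candidate move as soon as it proves worse than an option already examined, is the classic alpha-beta refinement of minimax search. A toy sketch of the idea (not Deep Blue's actual implementation) over a tiny game tree:

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over a toy game tree.
    A node is either a numeric leaf score or a list of child nodes.
    A branch is discarded (pruned) as soon as it is provably worse
    than an option already examined elsewhere in the tree."""
    if isinstance(node, (int, float)):  # leaf: a static position score
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # prune: the minimizer will never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break  # prune
    return value

# A depth-2 tree: the maximizer picks a branch, then the minimizer picks a leaf.
tree = [[3, 5], [2, 9], [0, 7]]
print(alphabeta(tree, -math.inf, math.inf, True))  # 3
```

    Deep Blue ran this kind of search with specialized hardware over vastly deeper trees, but the pruning principle is the same.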

    As impressive as Deep Blue was back then, present-day approaches are stronger by orders of magnitude, many of them inspired by the neural networks of the human brain. Data scientists create inputs and define outputs to detect previously indecipherable patterns, important variables that influence games, and ultimately, the next move to take.

    Models can also continue to ‘learn’ from playing different scenarios and then update the model through a process called ‘reinforcement learning’ (as the Go-playing AlphaZero program does). The result of this? The ability to process millions of scenarios in a fraction of a second to determine the best possible action, with implications far beyond the gameboard.

    Integrating machine learning models into your business workflows comes with its challenges: business analysts are typically unfamiliar with machine learning methods and/or lack the coding skills necessary to create viable models; integration issues with third-party BI software may be a nonstarter; and the need for governed data to avoid incorrectly trained models is a barrier to success.

    As a possible solution, one could use MicroStrategy as a unified platform for creating and deploying data science and machine learning models. With APIs and connectors to hundreds of data sources, analysts and data scientists can pull in trusted data. And when using the R integration pack, business analysts can produce predictive analytics without coding knowledge and disseminate those results throughout their organization.

    The use cases are already coming in as industry leaders put this technology to work. As one example, a large governmental organization reduced employee attrition by 10% using machine learning, R, and MicroStrategy.

    Author: Neil Routman

    Source: MicroStrategy

  • More and more organizations are basing their actions on their data

    More and more organizations are basing their actions on their data

    Many corporations collect data but don't end up using it to inform business decisions. This has started to shift.

    All in all, 2020 will go down as one of the most challenging and impactful years in history. It will also be known as one of the most transformative, with companies and individuals adjusting quickly to the new normal in both work and play, with a 'socially distant' way of life changing how people interact and communicate.

    Even amidst the chaos, we saw an influx of existing technologies finding new industry opportunities, such as videoconferencing tools, streaming platforms such as Netflix, telehealth applications, EdTech platforms, and cybersecurity, to name a few. All of these technologies are powered by one fundamental thing, yet this entity isn't being tapped to its full potential by SMBs and enterprises alike.

    That thing is data, collected by companies with the intent to inform business decisions and better understand and serve their customers. However, from what I have seen, more than 80 percent of data that businesses generate goes unused. This will drastically change in the next three years, with the majority of the data consumed being put to use.

    What's driving this trend

    Data generation was already a hot topic prior to the COVID-19 pandemic with a projected 59 zettabytes (ZB) of data created, captured, and copied over the last year according to IDC. This trend has only accelerated with the pandemic as companies are fast-tracking digital transformation initiatives. Adding to this, the ongoing health crisis is resulting in the avoidance of face-to-face interactions during the workday, causing digital interactions to increase tenfold. This has created even more data through connectivity tools and applications.

    Companies have realized that analyzing this data can help leaders make better-informed decisions rather than relying on gut feeling. Data has become so important to companies' success that according to Gartner, by 2022, 90 percent of today's corporate strategies will unequivocally list information as a critical enterprise asset and analytics as an essential competency. Leading organizations know that in order to drive success in their industry, they have to leverage data and analytics as a competitive differentiator, fueling operational efficiencies and innovation.

    Setting up for success

    Though the majority of data collected by businesses currently goes to waste, there are more tools emerging to help companies unify consumed data, automate insights, and apply machine learning to better leverage data to meet business goals.

    First, it's important to take a step back to evaluate the purpose and end goals here. Collecting data for the sake of having it won't get anyone very far. Companies need to identify the issues or opportunities associated with the data collection. In other words, they need to know what they're going to do with every single piece of data collected.

    To determine the end goals, start by analyzing and accessing different types of data collected to determine if it was beneficial to the desired outcome or has the potential to be but wasn't leveraged. This will help identify any holes where other data should be tracked. This will also help hone the focus on the more important data sets to integrate and normalize, ultimately making data analysis a more painless process that produces more usable information.

    Next, make sure the data is useful - that it's standardized, integrated across as few tech platforms as possible (i.e., not a different platform for every department or every function), and that the collection of specific data follows company rules and industry regulations.

    Finally, use data in new ways. Once your organization has integrated data and technology solutions, the most meaningful insights can often only be found using multidimensional analytics dashboards that take data from two previously siloed functions to understand how pulling a lever in one area affects costs or efficiencies in another.

    Using data to streamline business processes and lower costs

    One industry that's collecting data and using it efficiently to optimize business processes is the telematics industry. Before the digital transformation era, fleet managers and drivers had to rely on paper forms for vehicle inspections or logging hours of service. Now, many telematics-driven companies are relying on connected operations solutions to collect, unify, and analyze data for a variety of tasks such as improving fuel management, driver safety, optimized routing, systematic compliance, and preventive maintenance.

    We have seen fleets with hundreds of assets switch from other out-of-the-box telematics solutions to a more business-focused solution, which allows them to leverage data insights from their connected operations and realize meaningful improvements and cost savings. One such client recently reported saving $800,000 annually in field labor costs and $475,000 annually in fleet maintenance and repairs, and compliance with their overdue maintenance reduction initiative has gone from around 60 percent to 97 percent. It's clear that data contains the answers to an organization's challenges or goals. The question remains whether the organization has the tools to unearth the insights hidden in its data.

    Empowering decision makers through data

    The most important piece to the entire data chain is ensuring the right data insights get into the hands of decision makers at the right time. What use is accurate, analyzed data if it goes unused - as most of today's data does? Including the right stakeholders from across all business functions in the data conversations may unearth current challenges, as well as new opportunities that may have not otherwise been known. This is a step that many companies are now recognizing as crucial for success, which is why we will see more data consumed and put to use over the next three years.

    If they haven't already, executives and decision-makers at all levels should start looking at business operations through a data-centric lens. Companies that recognize and act on the fact that their competitive edge and profit growth lies in the insights hidden in their operational data can expect to see immediate ROI on their efforts to mine their data for golden insights. If they're not doing something about this now, they might just be in a race to the bottom.

    Author: Ryan Wilkinson

    Source: TDWI

  • Moving business intelligence, data and insights forward with graph technology

    Moving business intelligence, data and insights forward with graph technology

    Hospitals are one of the best examples to spotlight the complexities of unstructured data. From physicians’ notes in EHRs, emails, text files, photos, videos, and other files, the majority of patient data cannot be read by machines. Research firm IDC estimates that upwards of 80% of data will be unstructured, growing from 33 zettabytes in 2018 to 175 zettabytes by 2025. This example demonstrates a huge challenge in dealing with unstructured data and analyzing it when it is stored across disparate systems. The health care industry is just one prominent example of a sector awash with unstructured information that could have critical clinical and business-related data insights. That’s where graph technology comes in.

    The (Unstructured) Data Deluge 

    Graphs are one way to contextualize and explain data. Graphs themselves can be particularly large, with data sizes of 10 to 100 terabytes. As such, graph data has been particularly beneficial when data is large, continually evolving, and rife with high-value relationships. 

    Knowledge graphs, which make connections between seemingly disparate sources to provide specific business insights, have been in existence for many decades. Historically, knowledge graphs have been associated with search engines such as Google, which use them to enhance and hasten search results, as well as with social media networks such as LinkedIn and Facebook, which use them to understand their users and surface relevant content (including relevant ads and common friend connections). 
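    At its core, a knowledge graph is just entities connected by labeled relationships. A minimal sketch, with entirely hypothetical health care records, of how triples link data from disparate systems and support traversal queries:

```python
# A minimal knowledge graph as subject-predicate-object triples,
# linking records from hypothetical disparate sources (an EHR and
# a billing system) through shared entities.
triples = [
    ("patient:42", "has_condition", "condition:hypertension"),
    ("patient:42", "prescribed", "drug:lisinopril"),
    ("drug:lisinopril", "treats", "condition:hypertension"),
    ("patient:42", "billed_for", "claim:991"),
    ("claim:991", "covers", "drug:lisinopril"),
]

def neighbors(entity, predicate=None):
    """All entities directly connected to `entity`, optionally
    filtered by edge label (predicate)."""
    return [o for s, p, o in triples
            if s == entity and (predicate is None or p == predicate)]

# Traverse the graph: what conditions does patient 42's medication treat?
for drug in neighbors("patient:42", "prescribed"):
    print(drug, "->", neighbors(drug, "treats"))
# drug:lisinopril -> ['condition:hypertension']
```

    Production graph databases apply the same model at billions of nodes and edges, with indexing and distributed storage doing the heavy lifting.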

    In recent years, graph computing companies have flourished, with the benefits of graph databases, analytics, and AI trickling down from the big tech titans to a slew of organizations and industries. Gartner predicts that by 2025, graph technologies will be used in 80% of data and analytics innovations, up from a mere 10% in 2021. This raises the obvious question: given graph technology’s long legacy, why is it ballooning in demand and popularity now? 

    Barriers to Data Insights 

    One barrier to embracing graphs has been earlier approaches to gleaning insights from unstructured datasets. Traditional graph databases have aimed to address concerns with relational databases but were not built with analytics in mind. This meant that organizations hit performance limitations when traversing massive knowledge graphs or processing queries at low latency and scale. 

    Another barrier has been the lack of standardization in graph technology, which has resulted in high costs for any organization looking to move from one legacy database to another. Today, the industry still needs to make strides to cultivate the right tools and open standards, such as providing common libraries to allow data scientists to easily process large-scale graphs. 

    From Niche to Norm 

    For data-forward organizations, there are a few key solutions. A unified platform, which combines graph query, graph analytics, graph mining, and responsive graph AI, can offer unparalleled flexibility and scalability in understanding massive datasets. Such a platform can bring together disparate systems and reduce time to insight – or how long it takes for an analysis of the data to produce actionable feedback. This enables a faster sharing of those insights to facilitate faster decision-making and foster innovation. The rate of insight is important because industries that rely on graph computing can overtly benefit from real-time intelligence, such as monitoring network traffic for suspicious activity and alerting teams when any such activity is discovered. 

    For virtually every industry – from financial services to health care and pharmaceuticals – analytics and intelligence are only as good as the ability to truly understand and act on the vast amounts of data at hand. Beyond a unified platform, another option is to create metadata on top of the disparate systems and then build a knowledge graph on top of that, an approach called a “data lakehouse.” In this case, the metadata serves to extract information from the disparate systems and unify it into a knowledge graph that can be used to provide actionable insights.

    As organizations continue to experience an exponential rise in data, more enterprises will organically amass graphs that have billions of nodes, and hundreds of billions of edges. The most data-forward organizations will have to build scalable systems that not only reduce time to data insights but address the underlying complexities of unstructured data and legacy architectures.

    Author: Farshid Sabet

    Source: Dataversity

  • Moving your projects to the cloud, but why?

    Moving your projects to the cloud, but why?

    Understanding the cloud main advantages and disadvantages

    In this article, we are going to change the context slightly. In recent articles, we have been talking about data management, the importance of data quality, and business analytics. This time, I am very excited to announce that over the next few weeks we are going to explore a trend that will affect all companies in this decade: the cloud. The cloud is a very broad topic covering many concepts, so we’ll focus on data in the cloud.

    I think by now we have all heard about the cloud and its capabilities, but do you know all the benefits and implications it has? In this first post, I would like to explore the basic concepts of the cloud with you, and in the next few weeks, accompany you on a journey through how we can find relevant insights using cloud resources.

    First of all, I want you to understand why this post is for you. So, if you are…

    an individual, whether you’re in business or tech, you need to understand these concepts and how the cloud is changing the game.

    a company, you must have a cloud strategy. We are not talking about having your workload 100% migrated to the cloud tomorrow, but you should have a roadmap for the next few years.

    What is cloud computing?

    At this point, I would like to define what cloud computing is. Since 2017, countless statements have circulated on social networks saying:

    ''Cloud computing is just someone’s else computer''

    This false idea has spread over the Internet. I must admit that I had a sticker on my laptop with that slogan a few years ago. But the truth is, if you say that, you do not fully understand what cloud computing is. It is true that, reduced to its minimum, cloud computing is about renting compute power from others for your own purposes, but an entire world of possibilities has been built on this idea, with implications at all organizational levels of a company.

    Let’s talk about the advantages

    The economy of scale

    As you surely know, today everything is done on a large scale, especially when we talk about the world of data. For this reason, we must be able to operate less expensively and more efficiently when we do things at scale. The cloud takes advantage of economies of scale, allowing our businesses to become more profitable as they grow.


    Another of the many advantages of cloud computing is financial, because it changes the spending model. You should understand these spending models well to see why this is an advantage of the cloud.

    • Capital expenditure (CapEx): an upfront investment in a fixed asset, with the expense deducted from your tax bill over time. Examples of assets in this category include buildings, equipment, or, more specifically, a server or data center you buy (on-premise).
    • Operational expenditure (OpEx): expenses necessary for the operation of the business, deductible from your tax bill in the same year. There are no upfront costs; you pay for what you use.

    Operational expenses enable a pay-as-you-go pricing model, which allows your company to reduce costs and gain flexibility.
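    A toy back-of-the-envelope comparison (with entirely made-up prices, not any provider's actual rates) shows why the pay-as-you-go model matters for small or uncertain workloads:

```python
# Toy CapEx-vs-OpEx comparison with made-up numbers: an upfront
# server purchase versus pay-as-you-go cloud pricing that scales
# with actual usage.
capex_upfront = 20_000   # buy a server: everything is paid on day one
opex_per_hour = 0.50     # hypothetical on-demand hourly rate

def cloud_cost(hours_used):
    """OpEx: you pay only for what you use, nothing upfront."""
    return hours_used * opex_per_hour

# Below this usage level, renting is cheaper than buying outright.
breakeven_hours = capex_upfront / opex_per_hour
print(f"Cloud is cheaper below {breakeven_hours:,.0f} hours of use")

# e.g. a prototype running 8 hours a day for a year:
print(f"One year of office-hours use: ${cloud_cost(8 * 365):,.2f}")
```

    The arithmetic is deliberately crude (it ignores power, staff, and depreciation), but it captures the core trade-off between the two spending models.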

    Reduced time-to-market

    Thanks to the cloud, the time-to-market for new products or the growth of the existing ones is reduced.

    If your company, regardless of its size, wants to try a new product, with the cloud you will be able to do so much more agilely, since it allows you to allocate resources in a much faster and more precise way.

    On the other hand, if you already have a product running and want to make it grow to other countries, the cloud will allow you to do it much more efficiently.

    Scalability, elasticity and reliability

    Another advantage of the cloud is closely related to the pay-as-you-go model. In this case, we are talking about scalability and elasticity, which allows your business to constantly adapt to demand. This has two aspects: on the one hand, it prevents you from incurring extra costs when you have wasted infrastructure and, on the other, it allows your business to grow as demand grows, guaranteeing the quality of the service.

    Also, the cloud allows you to increase the reliability of your technology through disaster recovery policies, data replication, or backups.

    Focus on the business

    With the shared responsibility models of cloud providers, you can free yourself from certain responsibilities and put a greater focus on growing your business. There are different cloud models, which we will see below, but I anticipate that depending on the model you choose, the distribution of responsibility will vary.

    It’s not about being carefree. Using technology always carries great responsibility, and many aspects must always be borne in mind, especially when we talk about data. However, in any cloud model there will always be some responsibility delegated to the provider, which will allow you to free yourself, to a greater or lesser extent, from recurring and costly tasks for your business.


    Security

    I believe that security should always have a separate section. Closely related to economies of scale, security solutions are cheaper when deployed on a large scale, and the cloud takes advantage of this. Security is a key element today and a differentiator for many clients like you. This demand makes cloud providers put special focus on security.

    Finally, and related to the shared responsibility model, depending on the solutions implemented, the cloud provider usually acquires certain maintenance responsibilities such as the installation of updates, application of security patches, or security implementations at the infrastructure level so you don’t have to worry about these tasks.

    But why does nobody talk about the risks?

    There are always two sides. We have talked about the advantages and I am sure that many, if not all, you would already know. I hope that at this point, you have gathered that the cloud offers you great opportunities whether you are a small, medium, or large company.

    Now, why do so few people talk about risks? There is no perfect solution, so I think it is just as important to know the benefits as it is to talk about the risks. When you have all the information on the table, you can make a decision with a much more objective criterion than just seeing a part.

    Provider dependency

    When you use any technology, a link is established between your business and that technology. A dependency is created that can be higher or lower depending on the technology and the function it serves in your business. This dependency gives cloud providers greater bargaining power, as switching costs arise that were not so present before. For example, if we use accounting software or a CRM in the cloud, the switching costs are very high, because these systems perform core functions in your business.

    The same happens with the infrastructure, if all your technology infrastructure relies on a specific cloud provider, it gives that provider greater control. For example, if the cloud provider decides to change prices and you have all your infrastructure hosted with that provider, you have two options: either you accept the changes or you incur the cost of infrastructure migration.

    Not all services are available everywhere

    Not all cloud providers offer the same services and the same services from one provider are not available worldwide. You may need to use a service offered by a certain provider that is available in a geographic area that interests you. Now, if you need to scale to other geographic regions, that service may not be available and your ability to act will be limited.

    On the other hand, and related to the previous point, using a specific service with a certain provider does not imply that, should the time come when you need to change providers, you will be able to do so, since not all providers have the same catalog of services.

    As you have seen, the cloud has great potential for your business: it allows you to gain agility, reduce time to market, and optimise costs in ways that would be much more difficult with on-premise solutions. However, in addition to the advantages, you must always keep the main disadvantages in mind, since dependencies and switching costs that were not so present before may well appear.

  • Nederlanders vinden data in hun dagelijks werk essentiëler dan andere Europeanen  

    Nederlanders vinden data in hun dagelijks werk essentiëler dan andere Europeanen

    Drie kwart van Nederlanders geeft aan data essentieel te vinden bij hun dagelijks werk. Hierin loopt Nederland voorop op landen zoals het Verenigd Koninkrijk, Duitsland en Frankrijk. Dit blijkt uit onderzoek van Dataiku, specialist in everyday AI. Het onderzoek geeft aan dat ook bedrijfsgrootte invloed heeft op het belang van data en dat mensen vaak onderschatten welke impact AI heeft op hun werk.
    Van de Nederlandse respondenten ziet 74 procent data als onmisbaar in hun dagelijks werk. In het Verenigd Koninkrijk, Duitsland en Frankrijk is dit respectievelijk 64, 62 en 58 procent. Slechts 7 procent van de Nederlandse respondenten geeft aan data als niet essentieel te ervaren in hun dagelijks werk. Dit tegenover 17, 13 en 15 procent respectievelijk voor het VK, Duitsland en Frankrijk.


    Company size turns out to play an important role in the importance attached to data. At companies with between 50 and 500 employees, 86 percent consider data essential to their daily work. At larger companies this drops to 77 percent, and at smaller companies 68 percent regard the use of data as essential to their daily work.
    Jurriaan Krielaart, Sales Director at Dataiku: “So it is mainly mid-sized companies that more often see the added value of using data. The reality, however, is that data is of crucial importance to companies of all sizes and can be highly valuable even when applied on a smaller scale. Whether a company has 10 or 10,000 employees, chances are it faces exactly the same challenges when it comes to acquisition costs, customer churn, sales forecasting, logistics or capturing market share.”

    Underestimating the impact of AI on one's own role

    Beyond these Dutch insights, a remarkable European finding also emerges. The study shows that around 45 percent of employees in Europe think AI will have an impact on the sector they work in over the next five years. Nevertheless, only 38 percent of employees in Europe think AI will have an impact on their own role in the next five years.
    “Employees are aware of the impact of AI on their business, but not on their own work. People expect their industry to be affected by AI more than their actual job. This confirms what we often see in practice: not everyone realizes how large a role AI plays in their daily work,” says Krielaart. “We need to guide people better through the impact AI has on their day-to-day tasks and help them reap the benefits. There are enormous opportunities to apply more AI and data in virtually every role, but people need support to unlock that potential.”
    Source: Dataiku
  • Only Half of Companies Actually Use the Competitive Intelligence They Collect

    For more than 30 years, most large corporations worldwide have adopted competitive intelligence (CI) as a way to expedite good decisions. And yet for almost every company that uses CI in their decision-making, there’s another that disregards CI’s mix of industry analysis, rival positions, and market insight to their detriment.

    We recently conducted a survey of CI managers and analysts who’ve been through our training program to see how much their findings influenced major company decisions, and why. We received 236 responses from 21 industries in U.S. and European corporations, from CI-trained analysts in marketing, business development, strategy, R&D, finance, and other fields. They had an average of 6.3 years of experience in using CI frameworks and tools, and 62% were from companies with over $1 billion in annual sales revenues.

    We found that 55% of our respondents said that their input on major management decisions made enough difference to improve the decision. But 45% said their CI analysis did not.

    Why did some analysts have their input incorporated, while others didn’t? Our survey suggested several key reasons.

    First, many executives decide on a course of action and then use CI to ratify their choice. When asked, “What percent of your reports do you feel are just ‘confirmatory’ for an executive who already made a decision?” a full one-third of our respondents claimed “high” or “very high.” In these cases, the analysis may just be an obligation to be checked off a list.

    We also ran several simple OLS regression models and tested more than two dozen variables to see if they affected which companies actually allowed their CI analyses to influence their decisions. In the end, four variables turned out to be highly significant in explaining the difference in impact.

    1. The analyst was assigned a “sign-off” authority over major decisions. The single most effective way to ensure intelligence is used in any given decision is to give the analyst a say in moving it forward. In practical terms this means the analyst – not just the PowerPoint deck – becomes part of discussions leading to the decision. That is the one area where “intelligent organizations” differ most from others.

    2. Management was open to perspectives that were different from the internal consensus. Management that was more open to different perspectives was also more likely to ask the analyst for the “big picture” rather than just the data.

    3. The analyst’s report called for proactive action more than reaction. Most companies are reactive by nature, and a lot of intelligence is about reacting to competitors’ moves. However, the decisions that matter more may well be those that are proactive. When the analyst provided proactive recommendations, the analysis had more of an impact.

    4. The analyst was involved in product launches. We don’t know why analysts in this area felt particularly impactful, but we do know that competitive intelligence is highly popular in tactical areas, and that product launches are an area where companies are most worried about competitors’ responses; successful product launches depend on correctly gauging the response of other players in the market. These include, naturally, customers and competitors, but also the less obvious responses by distribution channels, regulatory authorities, and influencing agents. Lack of insightful anticipation of these reactions — which is where competition analysts have the greatest expertise — leads to many more failures than there should be. Perhaps the analysts involved with product launches are thus given more of a mandate than analysts involved in other kinds of activities.

    None of these steps involves spending millions on the intelligence or hiring legions of analysts. And overall, these four variables explained a respectable 40% of the variability in having an impact on decisions. In terms of magnitude of the effect, the simple “sign off” requirement from management was clearly the leading contributor to explaining variability of impact.
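
    The “40% of the variability” figure is the regression's R-squared statistic. As a minimal illustration of how that number is produced (with made-up data, not the survey's), a one-variable ordinary least squares fit and its R-squared can be computed as follows:

```python
# Minimal one-variable OLS fit and its R^2, standard library only.
# x might encode whether the analyst had sign-off authority (0/1),
# y a measure of decision impact; the numbers are invented for illustration.

def ols_r_squared(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Least-squares slope and intercept.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Perfectly linear data explains all the variability:
print(ols_r_squared([0, 1, 0, 1], [2.0, 5.0, 2.0, 5.0]))  # -> 1.0
```

    An R-squared of 0.40, as in the survey, means the predictors jointly account for 40% of the spread in reported impact, leaving the rest to factors the model does not capture.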

    For these decisions – the ones that were improved by competitive intelligence – CI analysts reported many applications of their insights. While product launches were over-represented, our respondents told us about a wide array of applications for their analyses. They were evenly distributed between pursuing opportunities (46%) and reducing risks (44%), and ranged across product pricing and features, capex investments, manufacturing processes, market expansion, joint ventures, M&A, and more.

    For example, in the pharmaceutical industry, respondents said that use of competitive intelligence had either saved or generated millions through discontinuing ineffective drug development efforts, walking away from bad deals and/or licensing opportunities, or accelerating new drug development based on what competitors were doing. As one respondent told us, “We accelerated our orphan disease program, based on accurate prediction of rival expected entry.”

    A common theme across industries was the smart reallocation of resources. One analyst told us that their company had stopped development on a project that was consuming lots of local resources after the analysis indicated it wouldn’t be effective. They then re-applied those resources to an area with true growth potential — that area is now starting to take off. In a different company, an analysis led to the cancellation of an extremely high-risk R&D program.

    This is not to discount the importance of ratifying a current course of action. In one of our favorite answers to our open-response question, an analyst described how CI had “identified only a single competitor, while determining others did not have the business case to continue a pursuit.” But it’s clear to us from this and other surveys we’ve done that the companies that get the most out of CI use it for a wide array of purposes – and actually let it shape their decisions.

    Source: Harvard Business review

  • Organic food: a niche or the future standard?

    Organic food: a niche or the future standard?

    Organic consumption is booming and has been a rapidly growing industry for over two decades. Although food safety issues, food scandals and environmental considerations have been around for longer than that, the organic market is accelerating faster than ever, driven by increasing transparency in the supply chain and in the way consumers are informed. There is a widespread perception among consumers that organic food offers distinct benefits over conventional food.

    Let’s define ‘organic’

    Organic food derives its label from the production process. It is certified food produced by methods that comply with the standards and legislation of organic farming. Even though these standards vary worldwide, they share the same principles: the avoidance of artificial chemicals, antibiotics, hormones, genetically modified organisms (GMOs) and artificial food additives, such as preservatives, sweeteners, coloring, flavoring and monosodium glutamate. Irradiation and industrial solvents are also typically not used in the process of organic food production. Organic food production focuses on the use of natural substances, crop rotation and nutrient recycling to conserve biodiversity and promote ecological balance. There is a strong emphasis on animal-friendly and environmentally friendly farming and cultivation methods.

    While an increasing number of processed organic products are on the market nowadays, the most purchased organic products are unprocessed foods, such as fruits and vegetables. Plant-based foods in particular are most often associated with organic food by consumers. Natural fertilizers are used to grow organic crops in order to improve plant growth, while natural pest enemies (e.g. ladybugs and wasps) are used for pest control.

    Although consumers generally associate organic food with plant-based products, animal-based products can also be produced organically. For these products to be labeled organic, animals may not be given antibiotics or growth hormones. Additionally, they may not descend from cloned animals, and they must have enough space (limited stocking density) and access to fields. This prevents overgrazing, pollution and soil erosion. Feed must also be organic; genetically modified feed is prohibited.

    Do consumers prefer organic food?

    There is a significant price difference between organic and non-organic food because production and labor costs are higher while yields are lower, a result of the strict framework farmers must comply with in order to be labeled organic. Despite the price premium, there are numerous reasons why groups of consumers prefer organic food products over their conventional counterparts. The main reason is the perception that organic food is healthier. Organic food is often perceived to have nutritional benefits, such as lower nitrate levels or more antioxidants, minerals and vitamins than regular food. Since organic crops and plants do not rely on chemical pesticides, they produce more antioxidants. The COVID-19 pandemic increased health awareness, which reinforced the health-related driver for consumers buying organic food. Several studies have found benefits of organic foods for consumers’ health, but the evidence is mixed. Additional research is necessary to confirm these health benefits.

    Another key consumer driver for organic food is the environmental perspective. In conventional farming, fertilization, overproduction, and the use of pesticides may negatively affect biodiversity, ecosystems, groundwater, and drinking water supplies. These issues are intended to be minimized in organic farming, which tends to improve soil quality and the conservation of groundwater.

    Other (less dominant) reasons advocating for organic food are animal wellbeing, greater (perceived) taste, reduced exposure to pesticides and toxins, a desire for natural products, and food safety considerations.

    In summary, eating habits are changing because consumers increasingly weigh indicators other than taste and price in their choice of food, such as a limited impact on nature and fair products. However, health claims about the benefits of organic food consumption need more evidence to be substantiated.

    Size of the prize and key organic markets

    The global organic food market has been growing at a CAGR of approximately 10% over the last ten years. Strong growth is expected to continue in the coming years, with the organic food market forecast to grow three times faster than the overall food market, indicating an increasing share of organic food in the overall food market. Globally, there are over 71 million hectares of certified organic farmland, representing approximately 1.5 percent of total world farmland. Half of the total organic farmland is located in Australia. In terms of consumption, however, the US and Western Europe are the leading organic markets globally.

    In the US, the organic food industry is one of the biggest and fastest growing food sectors with a value of approximately $50.1 bn. (~€44.7 bn.) in 2019.

    The European organic retail market experienced strong growth over the last years, growing at a CAGR of 9.7% (2011–2018). This growth is expected to continue at the same pace in the near future. The European market value is estimated at €44.5 bn. in 2019. The leading European countries in terms of per capita expenditure on organic products are Denmark, Sweden and Switzerland. Germany and France are the largest European organic markets but still have considerable growth potential given their relatively low consumption per capita.
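
    The CAGR figures quoted in this section follow the standard formula CAGR = (end / start)^(1/years) − 1. A short sketch (the starting value below is arbitrary; the 9.7% rate is taken from the text):

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1

def project(value, rate, years):
    """Project a value forward at a constant annual growth rate."""
    return value * (1 + rate) ** years

# At 9.7% per year, a market roughly doubles in about seven and a half years:
print(round(project(1.0, 0.097, 8), 2))  # -> 2.1
```

    This is the same arithmetic behind the growth forecasts quoted above.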

    In Asia, both production and consumption are rapidly increasing, with China and India becoming global producers. China and Japan are leading consumers of organic food with the former market mainly being accelerated by food safety crises at the end of the 2000s. Disparity in production and demand in Asia leads to large imports of organic products from Australia, New Zealand, Europe and the US.

    Market development varies across regions

    As mentioned previously, demand for organic food is forecast to continue its strong growth in the years to come, as groups of consumers prefer organic products for a variety of reasons. Production and supply chain challenges, however, limit market size and growth. Farmers find the shift to organic farming expensive, as the required investments do not directly translate into higher selling prices. It takes two to three years for farmers to convert from conventional to organic farming before their products can be sold with a certified organic label. Although production cannot keep up with the accelerating demand for organic products, this continuously growing demand has led to the introduction of multiple programs across the world supporting farmers in switching to organic agriculture. These programs are initiated by diverse stakeholders such as (agricultural) ministries, governments, traders and food producers. Another reason for proactive government support is the lower footprint of organic farming methods compared to conventional ones.

    Despite the forecast of continuing market growth, along with environmental and social benefits, this growth will not occur proportionally across regions. An often-mentioned downside of organic farming is that it cannot achieve food security for an ever-increasing global population. This indicates a clear limit to the potential of organic farming, and following this reasoning, the development of organic food production differs per region. While organic farming strongly focuses on quality, conventional farming emphasizes high quantity. Organic farming can become an important food production category in food-secure developed countries, whereas in regions with high population pressure and low land availability organic food production seems less significant. In countries where food demand is immense, local production, and most likely consumption, will not be organic.

    Can production keep up with demand?

    The short answer to this frequently raised question is no.

    Consumption of organic products generally does not depend on demand but on the availability of organic products; in other words, restricted production limits market size and growth. For organic production to follow demand, organic farmers must overcome challenges like finding enough reliable resources to produce organically, which is currently the main restriction on market growth. Other supply chain issues must be overcome as well, since the current organic supply chain does not allow for faster production growth. Compared to conventional food production, organic products are generally more difficult to produce, requiring more labor and resources, which puts pressure on profitability. Producers also face problems expanding production without losing the organic character. Land availability is a further limiting factor for organic crop cultivation and food production, due to the low agricultural intensification of organic farms: they require significantly more land and achieve lower yields than conventional farms.

    Changing market dynamics in developed markets

    Organic food production is expected to continue growing strongly, especially in more developed and food-secure markets, particularly once production challenges are overcome and farmers receive financial support in their shift to organic. In these regions, organic products are becoming mainstream in an increasing number of product categories. In less developed markets, growth is forecast to be smaller and more dependent on imports.

    Within the US and Western Europe, the two largest organic consumption markets and both highly food-secure, a clear consolidation trend can be seen. Over the last decade, the organic market in the US and Europe was shaped by innovative, smaller organic producers putting pressure on established food players struggling to shift to organic; this mainly applies to Europe. However, more and more multinationals are jumping on the organic trend. Especially in the US, larger corporations acquire smaller players to expand their portfolios with organic products. Good examples are Danone (acquired Whitewave) and Amazon (acquired Whole Foods Market). In Europe, the same trend can be seen with players like Unilever (Pukka Herbs ltd and Mãe Terra), Nestlé (Terrafertil) and Wessanen acquiring multiple organic producers. As the market rationalizes, this consolidation will continue over the next years as large multinationals push to expand their portfolios with organic products through M&A.

    This consolidation trend also occurs when support programs for farmers are insufficient. In the Netherlands, for instance, organic farming and consumption are growing but fewer farmers are switching to organic. This results from the expensive shift to organic and a lack of governmental support. Economies of scale seem to be the answer, which points to consolidation of organic providers. This corresponds with Hammer’s findings on farmers’ sentiment towards organic farming, measured across four key European markets (the Netherlands, United Kingdom, France, Germany). Hammer found that French dairy farming is most likely to pick up on the organic trend, while Dutch dairy farmers are least likely to switch to organic farming in the next ten years.

    What will the future bring?

    The future of organic food looks bright, with growth fueled by health and environmental considerations. Recent M&A activity confirms that the organic segment is hot and at the center of attention. The underlying dynamics of the market are changing, and consolidation in developed markets is expected to continue. There is a lot of potential for significant growth, also in current key markets: the organic food markets in key European markets France and Germany, for example, could almost double in value if they reached the average per capita expenditure of the leading European countries in this field. Production cannot keep up with organic consumption, and product availability determines organic consumption. This creates opportunities for exporting countries towards developed, food-secure markets such as the US and Western Europe. In conclusion, the organic market is far from reaching the tipping point of its growth and offers many untapped opportunities throughout the whole chain.

    Author: Mark Diesveld

    Source: Hammer, market intelligence

  • Qlik: the consequences of low data literacy

    Qlik: the consequences of low data literacy

    A gap has emerged between organizations' data-driven ambitions and employees' actual ability to extract value from data. This is the conclusion of a new report by Accenture and Qlik, 'The Human Impact of Data Literacy', conducted on behalf of The Data Literacy Project.

    Data is a gold mine that can drive innovation and growth within a company. However, when employees struggle to understand data, this can affect productivity and business value. Accenture and Qlik surveyed 9,000 employees worldwide. The survey shows that companies lose on average more than five working days (around 43 hours) per employee each year to stress caused by information, data and technology issues. Employees postpone their work, for example, or even call in sick. The worldwide cost runs into the billions: $109.4 billion in the US; $15.16 billion in Japan; $13.17 billion in the United Kingdom; $10.9 billion in France; $9.4 billion in Australia; $4.6 billion in India; $3.7 billion in Singapore; $3.2 billion in Sweden; and $23.7 billion in Germany.

    Insufficient skill in interpreting and analyzing data affects organizations' ability to compete in a data-driven economy. Nearly all surveyed employees (87%) see data as a valuable asset, yet only a minority actually use data in decision-making. Only a quarter of surveyed employees believe they are fully equipped to use data effectively, and just 21% say they are confident in their data literacy: the ability to read, understand, question and work with data. Moreover, 37% of employees base their decisions on data; almost half (48%) base decisions on gut feeling.

    Skills and training

    A lack of data skills reduces productivity in the workplace. Three quarters (74%) of surveyed employees say they feel overwhelmed or unhappy when working with data, which affects their overall performance. It leads, for example, to avoidance of working with data: 36% resort to alternative methods. A large majority (61%) find that the overload of data causes stress at work. The study underlines this further: just under a third (31%) of the global workforce say they take at least one day of sick leave because of stress related to information, data and technology. 'No one doubts the value of data, but many companies need to reinvent their approach to data management, analytics and decision-making. This means their workforce must have the skills and training needed to seize the new opportunities data offers,' says Sanjeev Vohra, Group Technology Officer and Global Data Business Lead at Accenture. 'Data-driven companies that focus on continuous learning are more productive and gain a major competitive advantage.'

    Providing guidance in a data-driven world

    To remain successful in the data revolution, business leaders must ensure their employees become more confident and more capable. Data-literate employees are quicker to say they have the confidence to make better decisions, and are consequently also trusted more to make those decisions. 37% of employees believe that data literacy training would improve their productivity.

    Jordan Morrow, Global Head of Data Literacy at Qlik and chair of the Data Literacy Project Advisory Board, adds: 'Despite the fact that companies recognize the value of data for their business success, most still struggle to build teams that can actually realize that value. Instead of investing in employees' self-sufficiency in working with data, much attention has gone to providing self-service access to data. The problem is that without the right training or the necessary tools, employees cannot make use of data. Think of it as fishing without rods, bait or nets: bringing someone to the water's edge does not mean they can catch a fish.'

    Five steps

    The report 'The Human Impact of Data Literacy' by Qlik and Accenture lists five steps that help organizations shape their data literacy strategy and work towards a data-driven workforce, including clear data expectations and a culture of co-evolution.
    Qlik and Accenture are founders of the Data Literacy Project, which aims to further support data skills. This global community encourages in-depth discussion and the development of the tools needed to build a confident, successful and data-literate society.

    Source: BI-platform

  • Recognizing the mismatch between your analytics platform and your business

    Recognizing the mismatch between your analytics platform and your business

    It’s no secret that analytics confers a significant competitive advantage on companies that successfully implement BI platforms and drive key decision making with data. Yet, many organizations struggle in this endeavor. So, why aren’t more analytics and BI implementations delivering results? No one believes that you can simply install analytics and BI software and magic will occur. It is understood that a successful implementation requires two other ingredients: people (end users) and processes (collaboration). The magic only happens when you have alignment on all three elements: the right people, the right processes, and the right tools.

    But what if you knew you had the best and brightest on your staff? And what if they were hungry to solve the organization’s most pressing challenges with data? What if the reason the BI implementation was failing was not the users or their willingness to work together, but that they were using the wrong analytics platform? What if the solution chosen as the centerpiece of an analytics strategy was not fit for duty?

    Watch for the signs

    Consider the following scenario: You finally chose the analytics platform that you hoped would propel your organization to success. At first, everything seemed fine. You went through dozens of stakeholder reviews and witnessed countless vendor demos. You spoke to your executive team, IT leaders, and line-of-business managers. You eliminated the platforms that seemed too complicated for the task and the ones that didn’t quite have the horsepower for your enterprise needs. Plus, the CEO loved the attractive visualizations and report templates included out-of-the-box.

    But now you are halfway through the implementation, and you are starting to see the signs that things are not going entirely to plan. You have the feeling that nothing has really changed in the way people go about their work and that the business has not made any significant progress. You look around and begin to feel that the BI application you selected may not have been the best choice. The following are four signs that you may have chosen the wrong platform:

    1. The content gives you answers everyone already knows

    Everybody loves pie charts. And column charts. And scatter plots. Any visualization is fantastic. However, visualizations are simply representations of data, and they often tell you what you already know. For example, say you have a pie chart on a dashboard that shows your top 10 customers by geography. It will wow you at first, but the novelty wears thin when you realize you already knew your top accounts. What you’d like to do is ask the next questions: What’s the year-over-year change in customers? Why am I losing or keeping them? Can I take my highest-performing salespeople and see why they are successful compared to the unsuccessful ones? If your platform gives you attractive charts but only offers a modicum of analytic depth, you’ll be left hungry for more.

    2. People are not using it

    Imagine that an analyst has a beautiful chart based on data from your accounting system showing product sales over the trailing three quarters. But the chart doesn’t tell her about profitability in the next three months, or the reasons for profitability. It only gives her the obvious answers.

    So, she reviews a separate profit and loss report (usually a grid of figures), cuts and pastes the data into Excel, applies a forecast algorithm, and then plops that into a PowerPoint to share with the VP of sales. Worse yet, she extracts it out of the accounting tool as raw data because the data in the BI platform was both stale and slightly incorrect. In short, she uses anything other than your company’s expensive analytics tool to produce the real insights. If your employees are not using the platform to make decisions, it risks becoming shelfware.

    One provider of a well-known BI platform likes to promote its high number of 'active touches'. What’s alarming is that the vendor considers an active touch to be once-a-month use. So, here are a few questions: Is a person actively communicating if they’re only checking their email once per month? Are you considered worldly if you only check the news once per month? Similarly, are your employees 'data-driven' if they’re only checking their analytics once per month? A successful implementation requires active use of data, and people should have a natural need to use it.

    3. Your tool is too simplistic to answer complex business questions; or, it’s too complicated for people to actually use

    You purchased the platform to accelerate speed-to-insight, not slow it down. However, if you find that your platform merely generates visualizations that don’t trigger meaningful action, then your analytics tool lacks sophistication. Data visualizations cannot make decisions for you; they simply provide representations of data. If the visualization is inherently unsophisticated, or simply restates the obvious, it’s just a pretty picture. And if the analytics tool doesn’t give you the ability to interrogate the data beyond the static (or lightly interactive) visualizations, or you need expert help to answer the question, that’s a problem. Your users require something more sophisticated if they’re going to use it. Difficult business questions require sophisticated tools.

    Many analytics platforms are rudimentary by design, in an attempt to cater to the lowest common denominator (the casual user who only lightly consumes information). Yet they alienate the users who want more than just attractive visualizations. Some platforms cater to the 5% of users who demand high-powered analytics, the data scientists among the user base. However, this again alienates the majority of users because the tool is too difficult or time-consuming to learn. Analytics is a continually evolving exercise. You need to be constantly thinking about the next question, and the next question after that. And the next question cannot come at a tremendous cost; it cannot be a development project that constrains decisions.

For an analytics implementation to truly work, it needs to cater to the 80% in the middle group of users. The ideal platform finds that middle ground. It provides a friendly UI that the average user can appreciate, but builds in sophisticated analytics with simplicity, so advanced users can explore greater depths and answer the tough business questions. The art is activating the 80%: those who need more than nothing, but less than everything.

4. Confidence in your insights and analysis is low

    Now, more than ever, users need data to inform their decisions, and they need to be able to trust the data. Desktop-based tools allow users to build their own content entirely untethered from the organization, regardless of whether the underlying data or analytics is accurate or not. This causes downstream problems and sows distrust in the integrity of the data. No one can act on information without confidence in the people, processes, and tools. Analytic platforms should provide governance capabilities to manage data from a centrally administered repository so that analysis can be reproducible and defensible. It should provide the means to trace the origins of the data, the techniques used to examine it, and the individuals who prepared the analysis.

    The dangers of picking the wrong analytics platform

Often, data visualization platforms are purchased when 'analytics' is merely a checkbox. The platforms may provide the ability to build and show data representations, but they seldom go deep enough. A serious analytics platform lets you and your business users ask the next big question, and the next one after that. And the questions are never simple. If the answer is obvious, the question usually doesn’t need to be asked.

    If you made a purchasing decision with analytics as an afterthought, you will see the signs with time. It could mean that your efforts won’t deliver meaningful value or, worse yet, that your efforts will utterly fail. So, if you are serious about your analytics, then get a serious analytics platform.

    Author: Avi Perez

    Source: Pyramid Analytics

  • Reusing data for ML? Hash your data before you create the train-test split

    Reusing data for ML? Hash your data before you create the train-test split

    The best way to make sure the training and test sets are never mixed while updating the data set.

    Recently, I was reading Aurélien Géron’s Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow (2nd edition) and it made me realize that there might be an issue with the way we approach the train-test split while preparing data for machine learning models. In this article, I quickly demonstrate what the issue is and show an example of how to fix it.

    Illustrating the issue

I want to say upfront that the issue I mentioned is not always a problem per se; it all depends on the use case. While preparing the data for training and evaluation, we normally split the data using a function such as Scikit-Learn’s train_test_split. To make sure that the results are reproducible, we use the random_state argument, so however many times we split the same data set, we will always get the very same train-test split. And in this sentence lies the potential issue I mentioned before, particularly in the part about the same data set.

Imagine a case in which you build a model predicting customer churn. You received satisfactory results, your model is already in production, and it is generating added value for the company. Great work! However, after some time, there might be new patterns among the customers (for example, a global pandemic changed user behavior), or you simply gathered much more data as more customers joined the company. Whatever the reason, you might want to retrain the model and use the new data for both training and validation.

    And this is exactly when the issue appears. When you use the good old train_test_split on the new data set (all of the old observations + the new ones you gathered since training), there is no guarantee that the observations you trained on in the past will still be used for training, and the same would be true for the test set. I will illustrate this with an example in Python:

    # import the libraries 
    import pandas as pd
    import numpy as np
    from sklearn.model_selection import train_test_split
    from zlib import crc32
    # generate the first DataFrame
    X_1 = pd.DataFrame(data={"variable": np.random.normal(size=1000)})
    # apply the train-test split
    X_1_train, X_1_test = train_test_split(X_1, test_size=0.2, random_state=42)
    # add new observations to the DataFrame
    X_2 = pd.concat([X_1, pd.DataFrame(data={"variable": np.random.normal(size=500)})]).reset_index(drop=True)
    # again, apply the train-test split to the updated DataFrame
    X_2_train, X_2_test = train_test_split(X_2, test_size=0.2, random_state=42)
    # see what is the overlap of indices
    print(f"Train set: {len(set(X_1_train.index).intersection(set(X_2_train.index)))}")
    print(f"Test set: {len(set(X_1_test.index).intersection(set(X_2_test.index)))}")
    # Train set: 669
    # Test set: 59

First, I generated a DataFrame with 1000 random observations. I applied the 80–20 train-test split using a random_state to ensure the results are reproducible. Then, I created a new DataFrame by adding 500 observations to the end of the initial DataFrame (resetting the index is important to keep track of the observations in this case!). Once again, I applied the train-test split and then investigated how many observations from the initial sets actually appear in the second ones. For that, I used the handy intersection method of Python’s set. The answer is 669 out of 800 and 59 out of 200. This clearly shows that the data was reshuffled.

What are the potential dangers of such an issue? It all depends on the volume of data, but it can happen that, in an unfortunate random draw, all the new observations end up in one of the sets and do not help much with proper model fitting. Even though such a case is unlikely, the more likely cases of uneven distribution among the sets are not desirable either. Hence, it would be better to distribute the new data evenly across both sets, while keeping the original observations assigned to their respective sets.

    Solving the issue

So how can we solve this issue? One possibility is to allocate the observations to the training and test sets based on a unique identifier. We can calculate the hash of each observation’s identifier using some kind of hashing function, and if the value is smaller than x% of the maximum possible value, we put that observation into the test set. Otherwise, it belongs in the training set.

You can see an example solution (based on the one presented by Aurélien Géron in his book) in the following function, which uses the CRC32 algorithm. I will not go into the details of the algorithm; you can read about CRC here. Alternatively, here you can find a good explanation of why CRC32 can serve very well as a hashing function and what drawbacks it has (mostly in terms of security, which is not a problem for us). The function follows the logic described in the paragraph above, where 2³² is the maximum value of this hashing function:

def hashed_train_test_split(df, index_col, test_size=0.2):
    """Train-test split based on the hash of the unique identifier."""
    test_index = df[index_col].apply(lambda x: crc32(np.int64(x)))
    test_index = test_index < test_size * 2**32
    return df.loc[~test_index], df.loc[test_index]

    Note: The function above will work for Python 3. To adjust it for Python 2, we should follow crc32’s documentation and use it as follows: crc32(data) & 0xffffffff.

    Before testing the function in practice, it is really important to mention that you should use a unique and immutable identifier for the hashing function. And for this particular implementation, also a numeric one (though this can be relatively easily extended to include strings as well).
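As hinted at in the parenthetical above, the same approach can be extended to string identifiers by encoding the string to bytes before hashing. The helper below is a minimal sketch of that idea (the name `id_in_test_set` is my own, not from the article):

```python
from zlib import crc32

def id_in_test_set(identifier, test_size=0.2):
    """Assign an identifier to the test set if its CRC32 hash
    falls below test_size of the 32-bit range (stable across runs)."""
    return crc32(str(identifier).encode("utf-8")) < test_size * 2**32

# The same ID always lands in the same set, no matter how
# the data set grows in the meantime
print(id_in_test_set("customer_12345"))
```

Because the assignment depends only on the identifier itself, adding or removing other rows never moves an existing observation between sets.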

    In our toy example, we can safely use the row ID as a unique identifier, as we only append the new observations at the very end of the initial DataFrame and never delete any rows. However, this is something to be aware of while using this approach for more complex cases. So a good identifier might be the customer’s unique number, as by design those should only increase and there should be no duplicates.

    To confirm that the function is doing what we want it to do, we once again run the test scenario as shown above. This time, for both DataFrames we use the hashed_train_test_split function:

    # create an index column (should be immutable and unique)
    X_1 = X_1.reset_index(drop=False)
    X_2 = X_2.reset_index(drop=False)
    # apply the improved train-test split
    X_1_train_hashed, X_1_test_hashed = hashed_train_test_split(X_1, "index")
    X_2_train_hashed, X_2_test_hashed = hashed_train_test_split(X_2, "index")
    # see what is the overlap of indices
    print(f"Train set: {len(set(X_1_train_hashed.index).intersection(set(X_2_train_hashed.index)))}")
    print(f"Test set: {len(set(X_1_test_hashed.index).intersection(set(X_2_test_hashed.index)))}")
    # Train set: 800
    # Test set: 200

    While using the hashed unique identifier for the allocation, we achieved perfect overlap for both training and test sets.


In this article, I showed how to use hashing functions to improve the default behavior of the train-test split. The described issue is not very apparent to many data scientists, as it mostly occurs when retraining ML models on new, updated data sets. It is not something often mentioned in textbooks, and one does not come across it while playing with example data sets, even the ones from Kaggle competitions. As I mentioned before, this might not even be an issue at all, as it really depends on the use case. However, I do believe that one should be aware of it and know how to fix it if there is such a need.

    Author: Eryk Lewinson

    Source: Towards Data Science

• Rubrik is a data resilience leader in the latest Forrester report


Rubrik is a data resilience leader in the latest Forrester report

In the latest edition of the Forrester Wave report on Data Resilience Solutions, Rubrik has been named a leader. The multi-cloud data control provider even received the highest score in the strategy category.

Forrester evaluated ten vendors against forty criteria, divided into three categories: current offering, strategy, and market presence. Rubrik achieved the highest possible score for strategy and security.

'Rubrik is a fit for companies looking to simplify, modernize, and consolidate their data resilience', according to the report. Rubrik is described as a 'simple, intuitive, and powerful policy engine that governs data protection, regardless of the type, location, or purpose of the data'.

According to Rubrik CEO Bipul Sinha, the recognition shows that Rubrik is well positioned to lead the transformation of the data management market. 'Customers are placing ever higher demands on data management solutions, going beyond just backup and recovery. Receiving the highest score for strategy confirms that we are on the right track to keep meeting our customers’ demands through innovation.'

Source: BI Platform

  • Seizing opportunities to be successful with enterprise BI

    Seizing opportunities to be successful with enterprise BI

Even today, enterprise companies are choosing a limited, “departmental” approach to business intelligence (BI) strategy and adoption. They resort to working with multiple vendors, each of which provides what they consider the best technology for an individual department within their companies. It’s a common phenomenon driven largely by rapid changes in the technology market, including growing data footprints and high demand for analytics. Departments within organizations have understandably sought the best solutions for their most urgent challenges.

But this approach rarely aligns with real market complexities, because an amalgamation of BI solutions doesn’t drive the organizational value that’s possible with a single, open analytics environment. Leading analytics environments of this kind provide self-service capabilities to business users across departments, connecting each user to a “single version of the truth” through interfaces designed specifically for their roles.

    In this article, I identify the recent industry trends driving enterprise leaders to embrace more sophisticated analytics environments of this kind. I also identify issues with a departmental approach versus the opportunities of a single-platform adoption strategy. You will discover how preparing and unifying business users across departments in this way can improve decision-making and create new strategic opportunities at all levels of the organization.

    New challenges and opportunities with enterprise BI

    At the relative mid-point of 2021, enterprise companies are still working to become more insights-driven and incorporate BI solutions in all varieties of business decision-making. This change movement is driven in part by the COVID-19 pandemic and related or concurrent economic disruptions, all of which are driving peak uncertainties across industries. Among other issues, those disruptions have revealed broad shortcomings in companies’ existing analytics investments.

    These gaps are creating new BI challenges for enterprises, as well as opportunities for those companies who approach BI investments with healthy growth in mind. Now, “85% of decision-makers prioritize the use of data insights, incorporating quantitative information into the decision-making process,” Forrester describes in a January 2021 article. “That growing appetite for data means that companies must get their internal data house in order—breaking down data silos, establishing consistency in definitions and formats, and encouraging enterprise-wide collaboration.”

    Deconstructing the “departmental analytics” mindset

    Again, it’s easy to understand why data leaders tend to approach analytics on a department-by-department basis. “Over the past decade, advances in digital analytics have transformed the way businesses operate,” reports McKinsey in April 2021. “From marketing and pricing to customer service and manufacturing, advanced analytics is now central to many corporate functions.”

Indeed, each department has its own unique needs, challenges, and goals where analytics can help. In that regard, a siloed analytics approach may yield more immediate results, such as greater locality and quicker decision-making within an individual department. Although settling for a siloed approach is easier in the short term, it will inevitably lead to broader problems and unwanted complexities later, especially in the face of macroeconomic change.

    In other words, the departmental BI investment strategy sets organizations on the wrong path to greater value in the long term. BI transformation must focus on aligning analytics with its people instead. In a single, open analytics environment, individuals thrive because they are working with the same information—albeit within their own individual workflows—to improve their individual teams’ results.

    Breaking down silos in a managed way with open analytics

Visualize a single enterprise company. Its marketing and operations departments make decisions using their own analytics solutions. As a result, their numbers conflict. Rather than drive value, these teams will spend much of their time determining whose numbers are correct.

    The benefits of roles-based data access to a single system outpace the benefits of a departmental approach. Marketers can embrace strategic decision-making using insights they trust from a broader set of data sources, and operations can access those same resources to improve core processes in ways that are accurate and consistent with those of marketing and other teams.

    Although these departments differ in purpose, their perception of reality is both shared and accurate, driving organizational value. When departmental leaders and executives meet for strategic decision-making, they can “unearth growth opportunities that would otherwise be hard to spot,” McKinsey describes.

    The long-term benefits of a single-environment approach

    “Trusted data means trusted business, but old-school data governance is not the answer,” Forrester shared in April 2021. “[Instead], improve trust, ensure compliance, and accelerate data value.”

    To get started, companies can identify BI tools that scale without locking departments and teams into proprietary layers. Instead, the right system will be open and able to communicate with other existing systems. Critically, it will also be accessible for users in multiple departments—an “umbrella” system that connects to other enterprise systems that individual departments or teams require.

    Once established, non-technical employees of all types and at all levels of the organization can begin accessing reliable analytics and make data-driven decisions in their own unique capacities. Best of all, they will all reference the same, sophisticated resources, and they will be free from the “constraints with multi-model data platforms,” as Forrester described in April 2021. Their companies will be “ready to support integrated, real-time, and self-service analytics” instead.

    Future-proofing your analytics and BI capacities

    Your company has an opportunity to get a head start on the strategic value driven by modern BI investments. “Relative to most corporate functions, strategy has not yet captured the benefits of advanced analytics, missing out on potentially critical insights,” McKinsey described in April 2021. “By tapping these technologies to complement the creativity of your team, you can materially improve your strategic outcomes.” But you need the right enterprise analytics technology for your BI transformation to succeed.

    Author: Omri Kohl

    Source: Pyramid Analytics

  • Selection bias: what is it and how to keep it out of your market research?

    Selection bias: what is it and how to keep it out of your market research?

    What Is Selection Bias?

Selection bias occurs when research participants are not randomly selected. This leads to errors in research results and negatively impacts the validity of research findings, because the sample does not accurately reflect the population of interest.

    This blog post discusses a real-life example of selection bias, presents a type of selection bias, and some tips on how to avoid selection bias in your research.

    Selection Bias in Politics: Truman vs Dewey

Selection bias is very common in research. It is so common that it has appeared even in political settings, for example in the 1948 US presidential race between Truman and Dewey. During the race, a nationwide political telephone survey was conducted to determine who the majority of the US voting population was going to vote for. The results implied that Dewey would win the race by a significant majority of votes. The Chicago Tribune even printed a newspaper article with the headline “Dewey Defeats Truman”, suggesting Dewey had won the election. However, this was not the case when the election results were released. As you might have guessed, Truman defeated Dewey.

But what went wrong in the political research study? The telephone was still relatively new in 1948 and came at a cost, which meant that only a minority of wealthy families owned telephones in their homes at the time. As it turned out, high-income individuals formed the bulk of Dewey’s voter base, whereas middle- to low-income voters were more likely to vote for Truman. The researchers failed to take this into account when selecting their data collection method. This resulted in skewed response data in favour of Dewey, which led to inaccurate and unrepresentative insights.
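The 1948 story can be reduced to a minimal simulation. The income shares and voting probabilities below are illustrative assumptions, not historical figures; the point is only that polling a non-random subgroup shifts the estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical electorate: 30% high-income, the rest middle/low-income
high_income = rng.random(n) < 0.30

# Illustrative voting probabilities: high-income voters lean Dewey,
# everyone else leans Truman (made-up numbers, not historical data)
votes_dewey = np.where(high_income, rng.random(n) < 0.80, rng.random(n) < 0.35)

true_share = votes_dewey.mean()               # the whole electorate
phone_poll = votes_dewey[high_income].mean()  # only telephone owners surveyed

print(f"True Dewey share:        {true_share:.2f}")  # roughly 0.48 -> Truman wins
print(f"Telephone-poll estimate: {phone_poll:.2f}")  # roughly 0.80 -> Dewey 'wins'
```

Surveying only the subgroup that owns telephones inflates Dewey’s apparent support by over thirty percentage points, even though every individual answered honestly.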

    A Type of Sample Selection Bias: Self-Selection Bias

There are different types of selection bias in sampling. Self-selection bias (also known as volunteer bias) is a common one. Like selection bias in general, self-selection bias occurs when respondents are demographically or behaviourally different to the population of interest. However, with self-selection bias, research respondents can decide on their own whether to participate in the research, rather than being pre-selected into a sample by researchers.

    The problem with self-selection bias is that certain groups of individuals with self-selecting characteristics may be attracted to taking part in particular studies. For example, research has shown that individuals that have sensation-seeking or thrill-seeking personalities are more likely to volunteer for certain research studies. This would skew the results of the research if it were to be studying these traits, making the research findings unreliable and ungeneralizable to the population.

    How Reliable Are TripAdvisor Reviews?

    Self-selection bias is so common that it can be seen in online reviews. This can be seen in a study conducted using TripAdvisor hotel reviews across four states in the US. This study compared hotel-prompted online reviews with self-motivated online reviews. The study is framed using the Social Exchange Theory (SET). SET is a psychology theory that describes all social interactions as a function of perceived cost and reward. Costs refer to factors that inhibit one from acting in a certain way. Rewards refer to internal motivations that drive individuals to behave in a certain way. In the case of the TripAdvisor reviews, reward may be the satisfaction reviewers get from leaving a review and sharing their experiences, and the cost could be the burden of creating a TripAdvisor account if they do not have one already.

    The researchers found that consumers’ decision to post an online review is associated with their level of satisfaction. Those who had bad or extreme experiences are more likely to post a review than those who had mild or positive experiences. This shows that there are specific characteristics that encourages self-selection in consumers which can skew ratings/reviews. In this case, those specific characteristics are bad or extreme experiences with the hotels where consumers stayed.

    Consumers that were self-motivated into leaving a review did so because their perceived gain is much greater than the perceived cost of leaving a review. This poses a risk of the pool of reviews being dominated by consumer unrepresentative experiences and views of these hotels. This could ultimately have negative implications for the reputation of these hotels and can act as a deterrence for potential consumers.

    How To Avoid Selection Bias

    Avoiding selection bias can be rather difficult, however here are some tips on how to avoid it where possible:

    • Use random methods of selecting the sample from the population of interest.

    • Ensure that key characteristics of the sample reflect the characteristics of the population of interest as much as possible.

    • Ensure that your data collection method is appropriate for the type of respondents needed for research.

• If, after reviewing your data, you realise that some groups are overrepresented while others are underrepresented, this bias can be resolved by assigning weights to the misrepresented subgroups. Applying a weighted average to your data takes into account the proportional relevance of each subgroup. This can lead to results that more accurately reflect the study population’s demographics.
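The weighting tip above can be sketched as follows, using made-up satisfaction scores and an assumed 50/50 population split: each respondent is weighted by their group’s population share divided by its share in the sample.

```python
import numpy as np

# Made-up satisfaction scores from a sample where men are overrepresented
scores = np.array([8, 7, 9, 8, 6, 4, 5], dtype=float)
group = np.array(["m", "m", "m", "m", "m", "f", "f"])  # 5 men, 2 women sampled

# Assumed population split: 50% men, 50% women
pop_share = {"m": 0.5, "f": 0.5}

# Weight = population share of the group / share of the group in the sample
weights = np.array([pop_share[g] / (group == g).mean() for g in group])

naive = scores.mean()
weighted = np.average(scores, weights=weights)
print(f"naive mean:    {naive:.2f}")     # 6.71
print(f"weighted mean: {weighted:.2f}")  # 6.05, pulled toward the underrepresented group
```

The weighted mean counts each woman’s response more heavily, so the estimate moves toward what a balanced sample would have produced.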

    Author: Jacqueline Oke

    Source: B2B International

  • Solutions to help you deal with heterogeneous data sources

    Solutions to help you deal with heterogeneous data sources

With enterprise data pouring in from different sources (CRM systems, web applications, databases, files, etc.), streamlining data processes is a significant challenge, as it requires integrating heterogeneous data streams. In such a scenario, standardizing data becomes a prerequisite for effective and accurate data analysis. The absence of the right integration strategy will give rise to application-specific and intradepartmental data silos, which can hinder productivity and delay results.

    Consolidating data from disparate structured, unstructured, and semi-structured sources can be complex. A survey conducted by Gartner revealed that one-third of respondents consider 'integrating multiple data sources' as one of the top four integration challenges.

    Understanding the common issues faced during this process can help enterprises successfully counteract them. Here are three challenges generally faced by organizations when integrating heterogeneous data sources, as well as ways to resolve them:

    Data extraction

    Challenge: Pulling source data is the first step in the integration process. But it can be complicated and time-consuming if data sources have different formats, structures, and types. Moreover, once the data is extracted, it needs to be transformed to make it compatible with the destination system before integration.

    Solution: The best way to go about this is to create a list of sources that your organization deals with regularly. Look for an integration tool that supports extraction from all these sources. Preferably, go with a tool that supports structured, unstructured, and semi-structured sources to simplify and streamline the extraction process.

    Data integrity

    Challenge: Data Quality is a primary concern in every data integration strategy. Poor data quality can be a compounding problem that can affect the entire integration cycle. Processing invalid or incorrect data can lead to faulty analytics, which if passed downstream, can corrupt results.

    Solution: To ensure that correct and accurate data goes into the data pipeline, create a data quality management plan before starting the project. Outlining these steps guarantees that bad data is kept out of every step of the data pipeline, from development to processing.


Data scalability

Challenge: Data heterogeneity leads to the inflow of data from diverse sources into a unified system, which can ultimately lead to exponential growth in data volume. To tackle this challenge, organizations need to employ a robust integration solution that can handle high volume and disparity in data without compromising on performance.

    Solution: Anticipating the extent of growth in enterprise data can help organizations select the right integration solution that meets their scalability and diversity requirements. Integrating one data point at a time is beneficial in this scenario. Evaluating the value of each data point with respect to the overall integration strategy can help prioritize and plan. Say that an enterprise wants to consolidate data from three different sources: Salesforce, SQL Server, and Excel files. The data within each system can be categorized into unique datasets, such as sales, customer information, and financial data. Prioritizing and integrating these datasets one at a time can help organizations gradually scale data processes.

    Author: Ibrahim Surani

    Source: Dataversity

• Rise in privacy complaints about personal data in the first half of 2019

Rise in privacy complaints about personal data in the first half of 2019

The number of privacy complaints continues to rise sharply. More than 15,000 people filed a complaint with the Dutch Data Protection Authority (Autoriteit Persoonsgegevens, AP) in the first six months of this year, the regulator reports. That is almost 60% more than in the second half of 2018.

According to the AP, people get stuck when requesting access to, or deletion of, their personal data. This occurs mainly with commercial service providers, such as energy suppliers and retailers.

In the first six months of 2019, 15,313 people filed a privacy complaint with the AP. The number of international complaints has also risen sharply. One explanation for the large number of complaints is that the option of filing a privacy complaint is new in the Netherlands and is becoming increasingly well known.

The AP resolved more than 10,000 complaints in the first half of 2019. Many complaints made by telephone could be handled satisfactorily right away. In many other cases, AP staff helped people file a complaint themselves with the organization the complaint was about. There are 68 ongoing investigations, prompted by a multitude of complaints.

Commercial service providers (46%), government (14%), and the IT sector (13%) are the sectors the AP receives the most complaints about. At commercial service providers, such as energy suppliers and retailers, complaints mainly concern people’s privacy rights and direct marketing. At government bodies, the lawfulness of data processing is raised most often.

Chairman Aleid Wolfsen believes a structural solution is needed, ‘so that we can continue to handle complaints adequately. The protection of your privacy is a fundamental right; it must never become an empty gesture.’

Since 25 May 2018, anyone can file a privacy complaint with the Autoriteit Persoonsgegevens (AP). This is possible if someone suspects that their personal data has been processed in a way that violates privacy law.

Source: Emerce

• Penalties for misuse of data expected to increase

Penalties for misuse of data expected to increase

Both the number of fines for mishandling user data and their size will rise in the coming years.

That is the conclusion of research by DSA Connect. More than a third (37%) of employees expect both the number and the size of fines to have risen by 2025. Six percent of respondents expect a ‘dramatic increase’, while three percent suspect the figures will actually fall in the coming years.

According to the research, one of the main reasons for the increase is that employees will have far more data at their disposal in the future (and the present). Last year, thirty percent of employees already indicated that they work with more data.


When it comes to processing and storing data, 76 percent of employees think their company does a good job. Almost half (47%) of the employees surveyed do not know whether their company has a policy for deleting data.

“With developments such as the Internet of Things (IoT), employers are dealing with more data than ever before. They are also facing a growing number of cyberattacks and ever stricter legislation around the protection of customer data and how they use it,” says Harry Benham, chairman of DSA Connect.

According to DSA Connect, employers should invest more time and resources in improving their strategies for handling customer data and reducing the risk of cyberattacks.

Source: TechZine


  • Strengthening Analytics with Data Documentation

    Strengthening Analytics with Data Documentation

    Data documentation is a new term used to describe the capture and use of information about your data.  It is used mainly in the context of data transformation, whereby data engineers and analysts can better describe the data models created in data transformation workflows.

    Data documentation is critical to your analytics processes. It helps all personas involved in the data modeling and transformation process share, assist, and participate in the data and analytics engineering process.

    Let’s take a deeper dive into data documentation, explore what makes for good data documentation, and see how a deep set of data documentation helps add greater value to your analytics processes.

    What is Data Documentation?

    At the simplest level, data documentation is information about your data. This information ranges from raw schema information to system-generated information to user-supplied information.

    While many people associate information about your data with data catalogs, a data catalog is a more general-purpose solution that spans all of your data and tends to be in the domain of IT. If an organization uses an enterprise data catalog, data documentation should further enhance the data from that catalog.

    Data documentation refers to information captured about your data in the data modeling and transformation process. Data documentation is highly specific to the data engineering and analytics processes and is in the domain of data engineering and analytics teams.

    How is Data Documentation Used?

    Data documentation is used throughout your analytics processes, including data engineering, analytics generation, and analytics consumption by the business. Each persona in the process will contribute and use data documentation based on their knowledge about the data and how they participate in the process:

    • Data engineers – This persona tends to know more about the data itself – where it resides, how it is structured and formatted, and how to get it – and less about how the business uses the data. They will document the core information about the data and how it was transformed. They will also use this information when vetting and troubleshooting models and datasets.
    • Data analysts and scientists – These personas tend to know less about the core data itself but completely understand how the data is incorporated into analytics and how the business would use the data. They will document the data with this type of information: what the data is good for, how it is used, if it is good and trusted, and what analytics are generated from it.
    • Business analysts and teams – These teams will interpret the analytics from the analytics teams to make decisions and resulting actions. The business side needs to understand where the data came from and how it was brought together to best interpret the analytics results. They will consume information captured by the data engineering and analytics teams but will also add information about how they use the data and the business results from the data.

    What Should You Expect for Data Documentation?

    The data documentation in many data transformation tools focuses on the data engineering side of the analytics process to ensure that data workflows are defined and executed properly. This basic form of data documentation is one way these tools help facilitate software development best practices within data engineering.

    Only basic information about the data is captured in these data transformation tools, such as schema information. Any additional information is placed by data engineers within their data modeling and transformation code – SQL – as comments and is used to describe how the data was manipulated for other data engineers to use when determining how to best reuse data models.

    The basic information capture and use in most data transformation tools limit the spread of information, knowledge capture, and knowledge sharing across the broader data, analytics, and business teams. This hinders the overall analytics process, makes analytics teams hesitant to trust data, and could lead to analytics and business teams misinterpreting data.

    As you evaluate data transformation tools, you should look for much broader and deeper data documentation facilities that your extended data, analytics, and business teams can use and participate in the process.  Information that can be captured, supplied, and used should include what is described below.

    Auto-generated documentation and information

    • The technical schema information about the data,
    • The transformations performed both within each model and across the entire data workflow,
    • Deep data profiles at each stage in the data workflow as well as in the end data model delivered to analytics teams,
    • System-defined properties such as owner, create date, created by, last modified date, last modified by, and more,
    • The end to end data lineage for any data workflow from raw data to the final consumed data model, and
    • Auditing and status information such as when data workflows are run, and data models have been generated.
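    As a sketch of what capturing such auto-generated documentation might look like programmatically (the function, field names, and inputs here are illustrative, not any particular vendor's API):

```python
from datetime import datetime, timezone

def document_model(name, rows, upstream):
    """Build an auto-generated documentation record for one data model.

    `name`, `rows` (a list of dicts), and `upstream` (names of source
    models) are illustrative inputs; real tools derive these from the
    warehouse and the transformation workflow itself.
    """
    columns = sorted({key for row in rows for key in row})
    return {
        "model": name,
        # Technical schema information inferred from the data
        "schema": {col: type(rows[0].get(col)).__name__ for col in columns},
        # A minimal data profile: row count plus per-column null counts
        "profile": {
            "row_count": len(rows),
            "null_counts": {
                col: sum(1 for row in rows if row.get(col) is None)
                for col in columns
            },
        },
        # End-to-end lineage: which models this one was derived from
        "lineage": {"upstream": upstream},
        # Auditing metadata captured at generation time
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }

rows = [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": None},
]
doc = document_model("dim_customer", rows, upstream=["raw_customers"])
print(doc["profile"])  # {'row_count': 2, 'null_counts': {'customer_id': 0, 'region': 1}}
```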

    User-supplied information

    • Descriptions that can be applied at the field level, data model level, and entire data workflow level,
    • Tags that provide a standardized means to label datasets, from what the data contains to how it is used,
    • Custom properties that allow analytics and business users to add business-level properties to the data,
    • Status and certification fields that have specific purposes of adding trust levels to the data such as status (live or in-dev) or certified,
    • Business metadata that allows analytics and business teams to describe data in their terms, and
    • Comments that allow the entire team to add ad-hoc information about the data and communicate effectively in the data engineering process.
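    A minimal sketch of how user-supplied annotations from the different personas might accumulate on a shared documentation record (all names and values here are hypothetical):

```python
# Shared documentation record that different personas annotate over time
doc = {"model": "dim_customer", "annotations": []}

def annotate(doc, author, **info):
    """Append one user-supplied annotation (description, tags, status, ...)."""
    doc["annotations"].append({"author": author, **info})

# A data engineer documents the transformation itself
annotate(doc, "data_engineer", description="Deduplicated customer master")
# An analyst adds tags and a certification level to signal trust
annotate(doc, "analyst", tags=["customer", "reporting"], status="certified")
# A business user leaves an ad-hoc comment
annotate(doc, "business_team", comment="Used for the quarterly churn review")

is_certified = any(a.get("status") == "certified" for a in doc["annotations"])
print(is_certified)  # True
```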

    Let’s explore how this broader and deeper set of data documentation positively impacts your analytics processes.

    Collaboration and Knowledge-sharing

    The broader and deeper data documentation described above helps the extended team involved in the analytics process to better collaborate and share the knowledge each has with the rest of the team. This level of collaboration allows the broader, diverse team to:

    • Efficiently hand off models or components between members at various phases,
    • Contribute and use their skills in the most effective manner,
    • Provide and share knowledge for more effective reuse of models and promote proper use of the data in analytics, and
    • Crowdsource tasks such as testing, auditing, and governance.

    Beyond making the process efficient and increasing team productivity, a collaborative data transformation workflow eliminates manual handoffs and misinterpretation of requirements. This adds one more valuable benefit: it eliminates errors in the data transformations and ensures models get done right the first time.


    Easy Discoverability

    When specific analytics team members are waiting for data engineering to complete a project and deliver analytics-ready datasets, they are typically involved in the process and receive a handoff of the datasets. But what about the rest of the analytics team? Perhaps they can use these new datasets as well.

    Your data modeling and transformation tool should have a rich, Google-like faceted search capability that allows any team member to search for datasets across ALL the information in the broad and deep data documentation.  This allows:

    • Analysts to easily discover what datasets are out there, how they can use these datasets, and quickly determine if datasets apply to the analytics problem they are currently trying to solve,
    • Data engineers to easily find data workflows and data models created by other data engineers to determine if they may already solve the problem they are tasked with or if they can reuse them in their current project, and
    • Business teams to discover the datasets used in the analytics they are consuming for complete transparency and to best interpret the results.
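    To illustrate the idea, a toy faceted search over documentation records might look like this (the record fields and catalog contents are invented for the sketch):

```python
def faceted_search(catalog, text=None, **facets):
    """Filter documentation records by free text and facet values.

    A record matches when `text` appears in its description and every
    requested facet matches, either as an exact value or as a member of
    a list-valued field such as tags.
    """
    results = []
    for doc in catalog:
        if text and text.lower() not in doc.get("description", "").lower():
            continue
        if all(doc.get(facet) == value or value in doc.get(facet, [])
               for facet, value in facets.items()):
            results.append(doc["name"])
    return results

catalog = [
    {"name": "dim_customer", "description": "Customer master data",
     "tags": ["customer"], "status": "certified"},
    {"name": "fct_orders", "description": "Order transactions",
     "tags": ["sales"], "status": "in-dev"},
]
print(faceted_search(catalog, text="customer", status="certified"))
# ['dim_customer']
```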

    Facilitating Data Literacy and Strong Analytics

    The broader and deeper data documentation we have described here can be used as a lynchpin for facilitating greater data literacy. This happens across all four personas:

    • Data engineers – the data documentation information provided by the downstream consumers of the data workflows allows data engineering teams to have greater knowledge of how data is used and helps them get greater context into their future projects,
    • Analysts – the information provided by data engineers, other analysts, and business teams allows analysts to gain a better understanding of how to use data and produce faster and more meaningful analytics,
    • Data scientists – they can use the information provided about the data to understand its best form and fit for their AI and ML projects, enabling faster execution of projects and highly accurate models, and
    • Business teams – these teams can use the information to increase the overall understanding of the datasets used to increase their trust in the results and perform fast, decisive actions based on the analytics.

    Wrap Up

    Your data documentation should be better than basic schema information and comments left by data engineers in their SQL code.  Everyone involved in the analytics process – data engineers, analytics producers, and analytics consumers – all have knowledge and information about the data that should be captured and shared across the entire team that helps everyone.

    Using a data transformation tool that provides the richer data documentation we’ve described here delivers a faster analytics process, fosters collaboration and easy discoverability, and promotes greater data literacy.  This leads to greater and better use of your data, strong and accurate analytics and data science, highly trusted results, and more decisive actions by the business.

    Author: John Morrell

    Source: Datameer

  • TDWI interview: the future of data and analytics

    TDWI interview: the future of data and analytics

    From migration struggles to vendor lock-in, we look at some of the most challenging enterprise problems and changes ahead with Raj Verma of MemSQL.

    Raj Verma is co-CEO at MemSQL. He brings more than 25 years of global experience in enterprise software and operating at scale. He was instrumental in growing TIBCO software and has served as CMO, EVP global sales, and COO.

    We asked Mr. Verma to tell us what lies ahead for enterprises in everything from data strategies to data architectures and management.

    What technology or methodology must be part of an enterprise's data or analytics strategy if it wants to be competitive today? Why?

    Raj Verma: Enterprises need to have the information to make decisions in the window of opportunity provided. Data is increasing by the second, and the useful life for data is diminishing. Any company that has postponed digitization is regretting it today. Having a strategy that is real time is what the current business environment requires. Some would say real time is not enough because you have to be predictive in understanding and determining how your customers or your resources are going to react to certain events that are likely to happen.

    Data strategies that do not have vendor lock-in will do well. The multi-billion-dollar empires in enterprise software are built on holding enterprises for ransom. To stay competitive, enterprises must avoid being locked in and subservient to one vendor. There needs to be an exit button enabling enterprises to leave and take their data out at any time. Having a good technology that is also philosophically aligned with their own organization is a best practice I recommend.

    What one emerging technology are you most excited about and think has the greatest potential? What's so special about this technology?

    Amino therapy is going to be game-changing. It will be phenomenal for physical and mental diseases. Anything we can do to hasten those research projects will be extraordinary.

    From a data perspective, I'm impressed with technology that has been built on first principles. First principles thinking means taking a fresh look at a problem and, with the latest research and technologies at hand, thinking through a better way to solve it, from the ground up.

    First-principles thinking simplifies seemingly complex problems and unleashes a wide range of creative possibilities. Technologies that were built on first principles allow companies to avoid lock-in and give a real-time view of what is happening within their organizations. They are also inherently flexible, so they are easier to deploy in new environments, such as hybrid cloud and multi-cloud.

    For databases and analytics, first principles thinking means determining what you want your database to do. Today, people want to work with the newest data, not just historical data, and they need to answer queries generated by AI programs and machine learning models quickly.

    To meet these new requirements, you need a conceptually simple, inherently scalable architecture. That's why we call MemSQL the Database of Now! Our technology is being used by large telecommunications providers to develop heat maps for regions with large COVID-19 infection rates to see where people are congregating and point out areas to avoid. It's helping to get medical supplies to the hospitals and frontline workers that need these products most. This requires technology that is based on first principles.

    What is the single biggest challenge enterprises face today? How do most enterprises respond (and is it working)?

    The biggest challenge that large enterprises of today face is how to migrate off their legacy platforms in an efficient, economical, and agile manner and align to new business realities. Competition for larger enterprises is only a click away. No one is safe in today's economy.

    The companies that do the hard and heavy lifting to respond to that will be surprised. Enterprises should be paranoid about how bulletproof their data architecture is. To standardize on a technology or a company is probably the worst decision you can make today. It is time for best-of-breed solutions that deliver the best of every world.

    Is there a new technology in data or analytics that is creating more challenges than most people realize? How should enterprises adjust their approach to it?

    The best way to kill an organization is to have bad data architecture and use that for decision making. The organizations that are rushing to buy analytics technology should pause to make sure their data strategy is right and confirm that they can make informed decisions with the data they have.

    Enterprises need to look at today's challenging landscape, including the heavy lifting of understanding various data sets, curating them in the right manner, governing the data sets with the right attributes, and identifying the dependability of one source of data over another. A good analytics or BI tool will bubble up both the good and the bad information. Unless you have strong convictions that your data architecture is right, don't go into AI blindly right now.

    What initiative is your organization spending the most time/resources on today? In other words, what internal project(s) is your enterprise focused on so that your company (not your customers) benefit from your own data or business analytics?

    The number one initiative for MemSQL is to have a thriving hybrid platform that is easy to consume both in a self-deployed manner and in the public cloud while also offering companies the utmost ease of use.

    Internally, we've added implementation of our customer data by adding support tickets with adoption and customer success. With our managed services, we've added to the number of users and queries that people can have on our trial in the cloud. Throughout all of this, we have always kept one initiative in mind: how do we help our customers get the most out of our own technology? We're drinking our own data. We've found it extremely useful to make investment decisions, to have interventions at customer sites to make them more successful, and to reach the outcomes that customers partnered with us for.

    Where do you see analytics and data management headed in 2020 and beyond? What's just over the horizon that we haven't heard much about yet?

    Data management will have to get a lot easier to perform and adopt. How do you make such a complex science easy to understand and consume? Analytics has already exploited the ease-of-use side. Analytics will have to get a lot wider in its ability to suck in data and provide a more holistic view of the organization. It must be nimbler to attract more sources of data inputs. Many AI tools are still extremely pricey. We can see a lot of those costs coming down as the user base expands.

    Describe your product/solution and the problem it solves for enterprises.

    The problem that MemSQL solves is how to marry real-time transactional data with historical data to make the best decision in the window of opportunity provided. MemSQL does it at scale, on commodity hardware, and with unprecedented speed and ease of use through a hybrid, multicloud environment.

    Author: James E. Powell

    Source: TDWI

  • The (near) future of data storage

    The (near) future of data storage

    As data proliferates at an exponential rate, companies must do more than simply store it: they must approach Data Management expertly and be open to new approaches. Companies that take new and creative approaches to data storage will be able to transform their operations and thrive in the digital economy.

    How should companies approach data storage in the years to come? As we look into our crystal ball, here are important trends in 2020. Companies that want to make the most of data storage should be on top of these developments.

    A data-centric approach to data storage

    Companies today are generating oceans of data, and not all of that data is equally important to their function. Organizations that know this, and know which pieces of data are more critical to their success than others, will be in a position to better manage their storage and better leverage their data.

    Think about it. As organizations deal with a data deluge, they are trying hard to maximize their storage pools. As a result, they can inadvertently end up putting critical data on less critical servers. Doing so is a problem because it typically takes longer to access data on slower, secondary machines. It’s this lack of speed and agility that can have a detrimental impact on businesses’ ability to leverage their data.

    Traditionally, organizations have taken a server-based approach to their data backup and recovery deployments. Their priority is to back up their most critical machines rather than focusing on their most business-critical data.

    So, rather than having backup and recovery policies based on the criticality of each server, we will start to see organizations match their most critical servers with their most important data. In essence, the actual content of the data will become more of a decision-driver from a backup point of view.

    The most successful companies in the digital economy will be those that implement storage policies based not on their server hierarchy but on the value of their data.

    The democratization of flash storage

    With the continuing rise of technologies like IoT, artificial intelligence, and 5G, there will be an ever-greater need for high-performance storage. This will lead to the broader acceptance of all-flash storage. The problem, of course, is that flash storage is like a high-performance car: cool and sexy, but the price is out of reach for most.

    And yet traditional disk storage simply isn’t up to the task. Disk drives are like your family’s old minivan: reliable but boring and slow, unable to turn on a dime. But we’re increasingly operating in a highly digital world where data has to be available the instant it’s needed, not the day after. In this world, every company (not just the biggest and wealthiest ones) needs high-performance storage to run their business effectively.

    As the cost of flash storage drops, more storage vendors are bringing all-flash arrays to the mid-market, and more organizations will be able to afford this high-performance solution. This price democratization will ultimately enable every business to benefit from the technology.

    The repatriation of cloud data

    Many companies realize that moving to the cloud is not as cost-effective, secure, or scalable as they initially thought. They’re now looking to return at least some of their core data and applications to their on-premises data centers.

    The truth is that data volumes in the cloud have become unwieldy. And organizations are discovering that storing data in the cloud is not only more expensive than they thought, but it's also hard to access that data expeditiously due to the cloud's inherent latency.

    As a result, it can be more beneficial in terms of cost, security, and performance to move at least some company data back on-premises.

    Now that they realize the cloud is not a panacea, organizations are embracing the notion of cloud data repatriation. They’re increasingly deploying a hybrid infrastructure in which some data and applications remain in the cloud, while more critical data and applications come back home to an on-premises storage infrastructure.

    Immutable storage for businesses of all sizes

    Ransomware will continue to be a scourge to all companies. Because hackers have realized that data stored on network-attached storage devices is extremely valuable, their attacks will become more sophisticated and targeted. This is a serious problem because backup data is typically the last line of defense. Hackers are also attacking unstructured data. The reason is that if the primary and secondary (backup) data is encrypted, businesses will have to pay the ransom if they want their data back. This increases the likelihood that an organization, without a specific and immutable recovery plan in place, will pay a ransom to regain control over its data.

    It is not a question of if, but when, an organization will need to recover from a ‘successful’ ransomware attack. Therefore, it’s more important than ever to protect this data with immutable object storage and continuous data protection. Organizations should look for a storage solution that protects information continuously by taking snapshots as frequently as possible (e.g., every 90 seconds). That way, even when data is overwritten, older objects remain as part of the snapshot: there will always be another, immutable copy of the original objects that constitute the company’s data, and it can be instantly recovered… even if it’s hundreds of terabytes.
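    The idea of never overwriting history can be sketched with a toy append-only version store (purely illustrative; a real immutable object storage system persists versions durably and off-host):

```python
import time

class ImmutableStore:
    """Append-only object store: writes never overwrite earlier versions.

    A toy stand-in for immutable object storage with frequent snapshots.
    """

    def __init__(self):
        self._versions = {}  # key -> list of (timestamp, data) tuples

    def put(self, key, data):
        # Every write appends a new version; older objects are retained
        self._versions.setdefault(key, []).append((time.time(), data))

    def latest(self, key):
        return self._versions[key][-1][1]

    def restore(self, key, version=0):
        # Recover any earlier version, e.g. the pre-attack copy
        return self._versions[key][version][1]

store = ImmutableStore()
store.put("report.xlsx", b"original contents")
store.put("report.xlsx", b"ENCRYPTED-BY-RANSOMWARE")
print(store.restore("report.xlsx"))  # b'original contents'
```

    Even after the second write simulates a ransomware overwrite, the first version remains recoverable.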

    Green storage

    Global data centers consume massive amounts of energy, which contributes to global warming. Data centers now eat up around 3% of the world's electricity supply and are responsible for approximately 2% of global greenhouse gas emissions. These numbers put the carbon footprint of data centers on par with the entire airline industry.

    Many companies are seeking to reduce their carbon footprint and be good corporate citizens. As part of this effort, they are increasingly looking for more environmentally-friendly storage solutions, those that can deliver the highest levels of performance and capacity at the lowest possible power consumption.

    In 2020, organizations of all sizes will work hard to get the most from the data they create and store. By leveraging these five trends and adopting a modern approach to data storage, organizations can more effectively transform their business and thrive in the digital economy.

    The ‘Prevention Era’ will be overtaken by the ‘Recovery Era’

    Organizations will have to look to more efficient and different ways to protect unstructured and structured data. An essential element to being prepared in the ‘recovery era’ will involve moving unstructured data to immutable object storage with remote replication, which will eliminate the need for traditional backup. The nightly backup will become a thing of the past, replaced by snapshots every 90 seconds. This approach will free up crucial primary storage budget, VMware/Hyper-V storage, and CPU/memory for critical servers.

    While data protection remains crucial, in the data recovery era, the sooner organizations adopt a restore and recover mentality, the better they will be able to benefit from successful business continuity strategies in 2020 and beyond.

    Author: Sean Derrington

    Source: Dataversity

  • The 5 dimensions that help your business with a successful technological transformation

    The 5 dimensions that help your business with a successful technological transformation

    Businesses that have mastered the ability to change quickly share one common denominator: technology is transforming their business. Technology can be a transformative engine that gives your organization the power to learn, adapt and respond at the pace of change.

    Today’s IT leaders have many tools to enable speed and flexibility, including Lean IT, Agile, DevOps and Cloud First among others. However, these concepts alone rarely deliver the technology transformation that organizations need because companies are tempted to think of transformation as a predominately organizational journey. Organizations need to think much more holistically in order to lead a technology transformation and enable a flexible and efficient business.

    There are five essential components, the 5 dimensions, that can lead to a successful technology transformation. Each dimension allows you to learn something unique about your organization, somewhat similar to an archeologist digging through an archeological tell. The 5 dimensions can be used to drive a holistic technology transformation that fits your historical and cultural context.

    Here's a brief look at the 5 dimensions and how they can serve you:

    1. Business alignment 

    Far too many organizations build their technology strategies by aligning with the tactics of their business operations. The result is strategic dissonance, as IT resources are not correctly prioritized to meet strategic business priorities. This misalignment leads to new architectural debt. Today's tech leaders need to understand the organization's business model and build a technology strategy that unlocks and empowers that model, ensuring alignment along the way.

    2. Architectural debt 

    Most organizations suffer from technical debt: systems built for expediency instead of best practices. Architectural debt, on the other hand, is the systemic root cause in the creation of technical debt. A recent survey by IDG and Insight Enterprises found that 64% of executives cited legacy infrastructure and processes as a barrier to IT and digital transformation. ‘Legacy infrastructure and processes’ is just another way of describing architectural debt. Debt is an important concept for technology organizations because it constrains flexibility and results in an IT organization managed by the inertia of their systems. If you want to lead an IT or digital transformation, you must quantify your architectural debt and pay down (minimize) or pay off (eliminate) that debt in order for your transformation to be both successful and sustainable.

    3. Operational maturity 

    IT organizations exist on a spectrum of maturity, classified into three distinct phases: operators, automators, and innovators. Operational maturity is a critical enabler of an organization’s ability to execute their vision or goals. There is a high correlation between business value and operational maturity. Mature IT organizations are focused on high quality, business value-added activities. An IT organization’s capabilities directly correlate with its phase of maturity along our spectrum. You must look at the people, processes, technologies and artifacts to understand where change must occur in order to increase operational maturity.

    4. Data maturity

    Clive Humby, U.K. mathematician and architect of Tesco's clubcard, famously said in 2006 that 'Data is the new oil… It’s valuable, but if unrefined it cannot really be used'. More than a decade later, The Economist called data the world’s most valuable resource. Many organizations are sitting on mountains of unrefined data, uncertain how they should be storing, processing or utilizing that valuable resource. Top-performing organizations that are using data to drive their business and technology decisions have a distinct competitive advantage today and tomorrow.

    5. Organizational dexterity 

    Your organization’s capacity for innovation and change directly correlates with its dexterity. To quote Peter Drucker: 'In times of turbulence, the biggest danger is to act with yesterday’s logic'. Organizations falter when they have institutionalized a culture of yesterday’s logic. An agile organization isn’t just a decentralized organization; it’s an organization that has the capability to learn and unlearn, and that demonstrates complex problem solving, emotional intelligence and much more.

    We live and work in turbulent times, with more volatility on the horizon. Is your technology ready? How about your organization? The 5 dimensions play a critical role in building a holistic understanding of your organization. Seeing the whole picture enables you to build a pragmatic path forward that leads to a true technology transformation.

    Author: Alex Shegda

    Source: Information-management

  • The 6 abilities of the perfect data scientist  

    The 6 abilities of the perfect data scientist

    There are currently over 400K job openings for data scientists on LinkedIn in the U.S. alone. And, every single one of these companies wants to hire that magical unicorn of a data scientist that can do it all.

    What rare skill set should they be looking for? Conor Jensen, RVP of AI Strategy at AI and data analytics provider Dataiku, has boiled it all down to the following: 

    1. Communicates Effectively to Business Users: To let the data tell a story, a data scientist needs to be able to take complex statistics and convey the results persuasively to any audience.

    2. Knows Your Business: A data scientist needs to have an overall understanding of the key challenges in your industry, and consequently, your business. 

    3. Understands Statistical Phenomena: Data scientists must be able to correctly interpret statistics: is a result representative or not? This skill is key, since the majority of stats that are analyzed contain statistical bias that needs correcting.

    4. Makes Efficient Predictions: The data scientist must have a broad knowledge of algorithms to select the right one, know which features to adjust to best feed the model, and understand how to combine complementary data. 

    5. Provides Production-Ready Solutions: Today’s data scientists need to provide services that can run daily, on live data. 

    6. Can Work On A Mass Scale: A data scientist must know how to handle multi-terabyte datasets to build a robust model that holds up in production. In practice, this means that they need to have a good idea of computation time, what can be done in memory and what requires Hadoop and MapReduce.
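    Ability 6 above (knowing what fits in memory and what must be streamed or distributed) can be illustrated with a minimal chunked aggregation; this is a stand-in for the MapReduce mindset, not an actual Hadoop job:

```python
def running_mean(chunks):
    """Compute a mean over data too large for memory by streaming chunks.

    Instead of loading the full multi-terabyte dataset at once, we
    aggregate incrementally, one memory-sized chunk at a time.
    """
    total, count = 0.0, 0
    for chunk in chunks:  # each chunk fits in memory
        total += sum(chunk)
        count += len(chunk)
    return total / count

# Simulated chunks of a dataset streamed from distributed storage
chunks = [[1, 2, 3], [4, 5], [6]]
print(running_mean(chunks))  # 3.5
```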

    Source: Insidebigdata

  • The benefits of continuous market monitoring

    The benefits of continuous market monitoring

    Market intelligence studies are often event-driven and have a clear beginning and end. Usually, there is a decision that needs to be substantiated by specific market insights. As a consequence, managers start scrambling to collect the required data or engage their research partner. After the decision has been made, daily routine dominates activities again. After some time, this cycle repeats itself, triggered by a new question that needs to be validated or substantiated.

    Considering the latter, would it not make far more sense to track your market on a more continuous basis?

    For several reasons we think it does!

    Continuously monitoring your market provides actionable insights that are highly valuable for decisions on marketing, strategic planning, positioning, and product development. Considering your products and brands as key assets, knowing how they are used, perceived, and recognized by your clients is crucial to steering your company's future growth and development.

    You can measure your brand and product by focusing on:

    • General performance
    • Awareness (aided & spontaneous)
    • Usage (penetration rates & market shares)
    • Customer satisfaction
    • Price position

    Of course, the parameters above are only suggestions; the possibilities are endless. Structural market monitoring offers many benefits; we will highlight three of them:  

    1. Always have access to the right market information

    Imagine you are asked to contribute to a new strategic plan for your department; you'll definitely need figures about market size, potentials, trends & drivers, shares, and competitor performance. Also, you need those figures as soon as possible. Obtaining this information can be extremely time consuming, especially if you have to start from scratch. By continuously monitoring your market, you ensure direct access to relevant market information based on market definitions shared across your organization.

    It does not matter if you need to provide information for a strategy plan, marketing campaign or product development session; all information is directly available at your fingertips.

    2. Track market developments over time

    By monitoring your market on a continuous basis, you are able to recognize long-term trends.

    • Is your product or brand gaining more awareness?
    • Is this also resulting in growing sales figures, or are they lagging behind?
    • Is the new marketing campaign paying off?

    By defining clear KPIs for your product or brand, you can draw the bigger picture to track developments over time.

    Also, you can compare the impact of actions over time. One organization may benefit from surveying customers yearly; another option is to conduct monthly or quarterly research. It all depends on your audience, goals, and business objectives. Market monitoring enables you to measure performance over time.
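    Tracking a KPI across measurement waves can be as simple as computing the change between periods. The sketch below uses made-up quarterly figures for aided brand awareness purely for illustration.

```python
# A minimal pure-Python sketch of tracking one KPI (aided brand awareness,
# as a share of respondents) across quarterly waves and computing the
# period-over-period trend.

awareness = {"Q1": 0.41, "Q2": 0.44, "Q3": 0.43, "Q4": 0.49}

def trend(series: dict) -> list:
    """Return (period, change vs. previous period) pairs."""
    periods = list(series)
    return [(p, round(series[p] - series[prev], 2))
            for prev, p in zip(periods, periods[1:])]

print(trend(awareness))  # [('Q2', 0.03), ('Q3', -0.01), ('Q4', 0.06)]
```

A dip such as Q3's -0.01 is exactly the kind of signal that one-off, event-driven studies tend to miss.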

    3. Be on top of your market

    We all know: daily operations are time-consuming, and research shows that most business leaders constantly feel they are missing out on key market developments. By continuously monitoring your market, you stay on top of developments, drivers, and new competitors. This is an absolute necessity in today's fast-paced world.

    Stop using general assumptions to drive strategic decision making, but use data and fact-based evidence.

    Key takeaway

    You can measure your product or brand on a wide variety of aspects, depending on the objectives of your organization. Surveying customers on a regular basis, for instance, provides your organization with actionable insights that can steer marketing, strategic planning, positioning, and product development decisions. Additionally, continuous market monitoring gives you direct access to the right information, lets you track developments over time, and keeps you on top of your market.

    Source: Hammer Market Intelligence

  • The data management issue in the development of the self-driving car

    The data management issue in the development of the self-driving car

    Self-driving cars and trucks once seemed like a staple of science fiction that could never become reality. Nevertheless, the past few years have given rise to a number of impressive innovations in the field of autonomous vehicles that have turned self-driving cars from a funny idea into a marketing gimmick and finally into a full-fledged reality of the modern roadway. However, a number of pressing issues are still holding these autonomous vehicles back from full-scale production and widespread societal embrace. Chief amongst them is the data management challenge wrought by self-driving vehicles.

    How should companies approach the dizzying data maze of autonomous vehicles? Here’s how to solve the data management of self-driving cars, and what leading automotive companies are already doing.

    Uber and Lyft want to release self-driving cars on the public

    Perhaps the most notable development in the creation of autonomous vehicles over the past few years has been that Uber and Lyft have both recently announced that they’re interested in releasing self-driving cars to the general public. In other words, these companies want autonomous vehicles that are navigating complex city environments by themselves and without the assistance of a human driver who can take over in the event of an emergency.

    Uber has already spent a whopping $1 billion on driverless cars, perhaps because the ridesharing app relies heavily on a workforce of freelancers who aren’t technically considered full-time employees. It could be that Uber and other companies see a financial imperative in automating their future workforce so that they don’t have to fret about providing insurance and other benefits to a large coterie of human employees. Whatever the company’s motivations, Uber has clearly established itself as a leader in the self-driving car space where investments are concerned and will continue to be a major player for the foreseeable future.

    Other companies like Ford may have the right idea, as they’re moving in the opposite direction of Uber and trying to take things slowly when debuting their autonomous vehicles. This is because Ford believes that solving the data management challenge of self-driving cars takes time and caution more than it does heavy spending and ceaseless innovation. Ford’s approach, the opposite of Uber’s, could pay off too, as the company has avoided the disastrous headlines that have followed Uber when it comes to testing and general brand PR.

    We can learn from Ford in one regard: haste, though important when delivering a product to market, often results in shoddy production that leads to costly mistakes. The company is taking things slow when it comes to collecting and managing data from auto insurance companies, a standard others should follow if they don’t want to get in over their heads. Ford’s focus on creating data 'black boxes', not dissimilar to those on airplanes, which can be consulted in the event of a major crash or incident for a data log of what occurred, is going to become a standard feature of autonomous vehicles before long.

    It’s a matter of trust

    It’s going to become increasingly obvious over the next few years that solving the data management challenges wrought by the advent of self-driving cars is going to be a matter of trust. Drivers need to be certain that their cars aren’t acting as surveillance devices, as does society broadly speaking, and manufacturers need to be taking steps to build and strengthen trust between those who make the car, those whose data the car collects, and those who analyze and utilize such data for commercial gains.

    The fierce competition between Tesla and Waymo is worth watching in this regard, largely because the profit incentives of the capitalist marketplace will almost assuredly lead both of these companies to throw caution to the wind in their race to beat one another via self-driving cars. We will only be able to solve the data management challenge issued by autonomous vehicles if we learn that sometimes competition needs to be put aside in the name of cooperation that can solve public health crises like deaths resulting from self-driving vehicles.

    The data management challenge posed by self-driving cars demands that the auto and insurance industries also take ethics into consideration to a hitherto undreamt-of extent. Modern vehicles are becoming surveillance hubs in and of themselves, with Tesla’s newest non-lidar approach to self-driving car data collection proving to be more accurate, and thus necessarily more invasive, than nearly any other technique that’s yet been pioneered. While this may help Tesla in the sense that it’s propelling the company ahead of its adversaries technologically speaking, it poses immense ethical questions, like what the responsibility of the market leader is when it comes to fostering innovations that necessarily surveil the public in order to function.

    It’s a self-driving world now

    The data management challenges being generated by the ceaseless advance of self-driving vehicles won’t go away anytime soon, as we’re now in a self-driving world where automation, data collection (another term for surveillance), and programmatic decision-making are the new standard. While we’ve grown used to always being the one doing the driving, humans are now being put in the backseat and must trust in the capacity of machines to deliver us to a brighter future. In order to arrive at our destination unimpeded, we need a new focus on ethics across the automotive and insurance industries that will ensure this new technology is primarily used for good.

    Additional regulation will be needed in order to protect the privacy of everyday people, and modern infrastructure must be constructed in order to alleviate the sensory-burden being placed on autonomous vehicles if they’re to succeed in the long-term. The good news for those who love self-driving cars is that the profit incentive is enough to make companies plow ahead regardless of the data management challenges they’re facing. This could result in huge ethical dilemmas later on, though, so those interested in self-driving cars can’t allow humans to become unmoored from the driver’s seat if we want our values to be represented on the roads of tomorrow.

    Author: Steve Jones

    Source: SmartDataCollective

  • The evolution of the Data Executive

    The evolution of the Data Executive

    Twenty years ago, no one had heard of a chief data officer. Now the position is central to C-suites in a majority of major companies (65% in NewVantage Partners’ 2021 survey). The rise in prominence of data executives goes hand in hand with the rise of the importance of data in the modern business world: Every company must become truly data-driven, and data executives play an outsized role in making that happen.

    If you’re an executive reading this, odds are you’re thinking about adding a data executive of some kind to your team. (Or maybe you want to be one yourself some day!) But what are data officers, and what do they do? How can they deliver value to their organizations — and what does that mean for you?

    “It’s complicated…” the evolving role of the data executive

    The role data plays in the fate of every company is constantly evolving, as is the role of the data executive. Similarly, the duties of this vital C-suite player are not fully agreed on yet, even by those who hold the title. Further complicating matters is a group of similarly named leaders (including CIO, chief analytics officer, and vice president of analytics) that likely have different domains. Regardless of what they’re called, in an organization that also has an information or analytics executive, collaboration is crucial.

    When the position first formed nearly two decades ago, it was mainly to focus on playing defense with data — keeping it secure — from external threats. A lot of those remnants of the past remain in the position, but as the value of data has soared, a data executive’s success is increasingly tied to business goals. Some of these lofty objectives include defining a robust data strategy which historically focused on:

    • Operation: Making sure data is available to users who need it, in a format that’s usable and easy to access
    • Incremental improvement: Guiding the organization to data-informed ways to decrease costs and gain efficiencies
    • Analytics and data science: Providing necessary aids so the organization can gain intelligence from data gleaned internally and externally
    • Governance and security: Ensuring enterprise data is maintained, secured, and handled safely and deliberately

    The changing demands placed on data executives 

    Although data executives’ focus areas are still developing, organizations increasingly expect them to take the lead in some of these key ways:

    • Value creation. Combining data, domain expertise, and an analytics platform opens up opportunities for “new revenue for your company and a ton of new value for your existing customers,” according to Sisense Managing Director of Data Monetization and Strategy Consulting Charles Holive. Seventy percent of CDOs are charged with revenue generation as a first priority, and 14% are compensated based on revenue gains.
    • Data quality, availability, and security. CDOs work to ensure data across the organization is clean and correct. Moreover, they balance the competing demands of data security, access, and quality across sources and through subordinate organizations.
    • Data-driven culture. Even though data’s relevance has become obvious, efforts to create a data-driven culture have proven ineffective overall. CDOs are now instrumental in guiding departments to infuse intelligence into workflows, so employees engage with data automatically at key decision-making points. In addition, CDOs lead the charge to educate employees on how to use data, though 61% recognize a skill-set gap still remains.

    Developing the modern data strategy

    Data strategies vary from organization to organization, but across industries they typically contain components such as:

    • A strong data management vision: “What do we want data to do for us?”
    • A deliberate tie to business objectives, which can then drive short-term and long-term data goals and tactics
    • Metrics to gauge success, allowing the executive and stakeholders to detect (and replicate) victories, as well as curtail experiments that aren’t yielding worthwhile results
    • Overarching guidance for how leadership sees data’s role in business, including clear ethical lines

    The data executive plays an essential role in crafting this data strategy. In fact, today’s fluid environment forms a perfect opportunity to redefine it in light of new technological, cultural, and business needs. 

    The data strategy of the future will be formed by three imperatives:

    1.  Provide new value to the organization. When data leaders understand stakeholder concerns across the business, they can help them leverage the power of data in new and exciting ways. That will transform organizations from the inside out, with the result that data becomes a differentiator (or even a revenue stream) with customers and partners.

    2. Rethink data-driven culture. Instead of trying to force a data-driven culture, change the way you think about it with next-gen analytics. Use AI to bridge skill set gaps, and leverage low-code tools to minimize talent shortages. In the end, make data use easy by bringing it to the people, rather than requiring action from them. 

    3. Ensure streamlined data processes and consolidate when needed. To minimize tech creep or misinformation within the organization, build a robust, trustworthy system that lends itself to automating data access, delivery of insights, and more.

    The data executive of the future

    The role of the data executive is certain to keep evolving in coming years. As more companies come to embrace the power of data, CDOs will naturally move beyond their current responsibilities. When employees find data-derived intelligence infused throughout their workday, the data officer’s role will have taken a major turn. 

    This much is certain, though: The CDO of the future will continue to meet the challenge of how best to apply the data at hand — whatever it is — to the organization’s short- and long-term business goals.

    Author: Mindi Grissom

    Source: Sisense

  • The future of AI and the key of human interaction

    The future of AI and the key of human interaction

    Artificial intelligence technology is evolving at a faster pace than ever, largely due to human powered data.

    Artificial Intelligence (AI) has significantly altered how work is done. However, AI has an even bigger impact by enhancing human capabilities. Research conducted by the Harvard Business Review found that interaction between machines and humans significantly improves firms’ performance.

    Successful collaboration between humans and machines enhances each other’s strengths, including teamwork, leadership, creativity, speed, scalability, and quantitative capabilities.

    How Humans Collaborate with Machines

    For the successful collaboration between machines and humans, humans are required to carry out three crucial roles:

    • Training the machines to carry out specific roles.
    • Explaining the outcomes of those tasks.
    • Sustaining the responsible use of machines.

    Human labeling and data labeling are, however, important aspects of the AI function, as they help identify and convert raw data into a more meaningful form from which AI and machine learning systems can learn.

    Artificial Intelligence, in turn, needs to process that data to draw conclusions.

    AI also needs continuous process monitoring to ensure that errors are tracked and efficiency is maintained. For instance, although an autonomous vehicle can drive independently, it may not register its surroundings like a human driver. Therefore, safety engineers are needed to track these cars’ movements and alert systems if the vehicles pose a danger to humans or buildings.

    More and more business owners are adopting AI and other machine learning technologies to automate their decision-making processes and also help them uncover new business opportunities. However, using AI to automate business processes is not easy. Businesses use data labeling that allows AI systems to understand the environments and conditions in the real world accurately.

    Human involvement in AI is possible through human labeling. This massive undertaking requires input from groups of people to help correctly identify objects, including digitization of data, Natural Language Processing, Data Tagging, Video Annotation, and Image Processing.
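    What a human-labeled record looks like in practice can be sketched very simply. The schema, label names, and annotator IDs below are hypothetical, chosen only to show how raw data becomes supervised training input.

```python
# A minimal sketch of human data labeling: raw text tagged by annotators so
# that a supervised model can learn from (input, label) pairs.

raw = ["Great product, works perfectly", "Broke after two days"]

labeled = [
    {"text": raw[0], "label": "positive", "annotator": "human_1"},
    {"text": raw[1], "label": "negative", "annotator": "human_2"},
]

# A model trains on (text, label) pairs; keeping the annotator field
# supports quality control such as inter-annotator agreement checks.
pairs = [(r["text"], r["label"]) for r in labeled]
print(pairs[0][1])  # positive
```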

    How Artificial Intelligence is Impacting Data Quality

    1. Elimination of Human Mistakes

    Many believe that AI will replace human intelligence, and that is not far from the truth: artificial intelligence has the potential to combat human error by taking over the taxing responsibilities associated with the analysis, drilling, and dissection of large volumes of data.

    Data quality is crucial in the age of artificial intelligence. The quality of data encompasses a wide range of factors, including accuracy, completeness, uniformity, and authenticity. However, analyzing heterogeneous data and then interpreting it into one or more structures has been challenging. The biggest challenge remains the early detection of data issues, which often go unnoticed by the data owners.

    Before AI, data entry depended on humans. Errors were therefore rampant, and consistent data quality was impossible to achieve. Fortunately, AI eliminates this human factor, significantly improving data quality.

    2. Faster and Better Learning

    Although the primary goal of AI is to enhance data quality, not all data collected is of high quality. However, AI uses algorithms that can screen and handle large data sets. Even with these technologies, systemic prejudices are unavoidable. Therefore, algorithm testing and training on data quality are necessary.

    3. Enhances the Identification of Data Trends to Aid Decision Making

    AI and Machine Learning ensure that data trends are identified. Domain experience helps explain the data patterns used in commercial decision-making. In addition, domain experts are responsible for identifying unexpected data patterns, so that legitimate data is not lost and invalid data does not influence the outcome.

    4. AI and Machine Learning Enhance Data Storage

    When a data storage device is lost, the information and training it holds are lost with it. However, Artificial Intelligence continues to progress and will help collect and store useful information over time.

    5. Assessment of Data Types for Quality

    While different metrics can be used to determine data quality, accuracy is the primary focus, since it is easy to adapt for different data sets and is a direct concern for decision-makers. Data quality is crucial in Artificial Intelligence and automated decision-making. Assessing the accuracy of data requires identifying data types, which in turn requires the identification, interpretation, and documentation of data sources.
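    Two of the quality dimensions named above, completeness and accuracy, lend themselves to simple automated checks. The records and validity rule below are hypothetical, meant only to illustrate the idea.

```python
# A minimal sketch of automated data-quality checks: completeness (share of
# non-missing values) and accuracy (share of values passing a validity rule).

records = [
    {"id": 1, "age": 34,   "email": "a@example.com"},
    {"id": 2, "age": None, "email": "b@example.com"},
    {"id": 3, "age": 210,  "email": None},
]

def completeness(rows, field):
    """Share of rows where the field is present."""
    return sum(r[field] is not None for r in rows) / len(rows)

def accuracy(rows, field, valid):
    """Share of non-missing values that pass the validity rule."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(valid(v) for v in values) / len(values)

print(round(completeness(records, "age"), 2))                       # 0.67
print(round(accuracy(records, "age", lambda v: 0 <= v <= 120), 2))  # 0.5
```

Checks like these catch issues (a missing age, an age of 210) early, before they silently skew a downstream model.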


    The digital transformation is on, and many businesses are jumping onto the AI and machine learning bandwagon. This has resulted in larger, more sophisticated data streams, posing challenges to data quality. It is only reasonable for companies to invest in AI and machine learning as they provide data safety, protection, and collection tools.

    However, the move toward AI and machine learning will require the involvement of the human factor trained in AI algorithm programming. AI will be directed toward different fields, including robotics, automated scheduling and learning, general intelligence, and computer vision. For these fields to mature, there will be a need to generate and access massive amounts of data.

    The collected data will have to be broken down into a format easily recognizable by the AI systems. As AI enhances task automation, big data will continue to grow larger. Even as more data for analysis and learning is available, AI will continue to grow.

    If your company has not yet invested in AI and machine learning, then it is time. However, you need to understand that machines cannot work independently and that you need to invest in experts who will work collaboratively to ensure data quality.

    Author: Ryan Kh

    Source: Smart Data Collective

  • The persuasive power of data and the importance of data integrity

    The persuasive power of data and the importance of data integrity

    Data is like statistics: a matter of interpretation. The process may look scientific, but that does not mean the result is credible or reliable.

    • How can we trust what a person says if we deny the legitimacy of what he believes?
    • How can we know a theory is right if its rationale is wrong?
    • How can we prove an assertion is sound if its basis is not only unsound but unjust?

    To ask questions like these is to remember that data is neutral, it is an abstraction, whose application is more vulnerable to nefarious ends than noble deeds; that human nature is replete with examples of discrimination, tribalism, bias, and groupthink; that it is not unnatural for confirmation bias to prevail at the expense of logic; that all humanity is subject to instances of pride, envy, fear, and illogic.

    What we should fear is not data, but ourselves. We should fear the misuse of data to damn a person or ruin a group of people. We should fear our failure to heed Richard Feynman’s first principle about not fooling ourselves. We should fear, in short, the corruption of data; the contemptible abuse of data by all manner of people, who give pseudoscience the veneer of respectability.

    Nowhere is the possibility of abuse more destructive, nowhere is the potential for abuse more deadly, nowhere is the possible, deliberate misreading of data more probable than in our judicial system.

    I write these words from experience, as both a scientist by training and an expert witness by way of my testimony in civil trials.

    What I know is this: Data has the power to persuade.

    People who use data, namely lawyers, have the power to persuade; they have the power to enter data into the record, arguing that what is on the record, that what a stenographer records in a transcript, that what jurors read from the record is dispositive.

    According to Wayne R. Cohen, a professor at The George Washington University School of Law and a Washington, DC injury claims attorney, data depends on context.

    Which is to say data is the product of the way people gather, interpret, and apply it.

    Unless a witness volunteers information, or divulges it during cross-examination, a jury may not know what that witness’s data excludes: exculpatory evidence, acts of omission, that reveals the accused is not guilty, that the case against the accused lacks sufficient proof, that the case sows doubt instead of stamping it out.

    That scenario should compel us to be more scrupulous about data.

    That scenario should compel us to check (and double-check) data, not because we should refuse to accept data, but because we must not accept what we refuse to check.

    That scenario summons us to learn more about data, so we may not have to risk everything, so we may not have to jeopardize our judgment, by speculating about what may be in lieu of what is.

    That scenario is why we must be vigilant about the integrity of data, making it unimpeachable and unassailable.

    May that scenario influence our actions.

    Author: Michael Shaw

    Source: Dataversity

  • The striking similarities between presenting analytics and telling jokes

    The striking similarities between presenting analytics and telling jokes

    Everyone is familiar with the age-old adage that if you must explain a joke after you tell it, then the joke will be a flop. The same principle is true when you put data in front of a live audience, whether with a table, a graph, or a chart. This blog will clarify what seems at first an unlikely comedic connection.

    The link between comedy and analytics

    No matter how funny a joke may be, it will not be funny if someone does not immediately understand what it is that makes the joke funny. Once explained, the person may logically understand why the joke is funny, but they will not experience the humor in the same way they would have if they had gotten the joke on their own. Somehow, the humor is only truly felt if you “get” the joke both immediately and on your own.

    I often ask my audiences how they feel charts and graphs are like jokes when I am discussing this topic during a session. Over time, I have received several good answers beyond the one I am looking for when I ask. Some of the legitimate ways that audience members have tied jokes to charts and graphs include:

    a) Most are bad

    b) Few people are good at delivering them

    c) The best ones are simple

    d) Context can heavily influence audience reception

    e) If you have to explain it, you’ve failed

    All of those are true, but for this blog, we are going to focus on answer e). Just as you’ve failed in your humor if you have to explain your joke, you’ve failed in your analytics presentation if you have to provide an explanation for your charts and graphs.

    Why simplicity matters

    Whatever format your data is presented in, it is important that it is easy for your audience to comprehend the core information and the point you are making about that information very quickly and with limited effort. If you achieve this, then the audience will remain focused on the narrative and context that you provide to support the chart or graph. This is important for several reasons:

    1. When you are presenting, you want people listening to you and the story you are telling. You do not want them struggling to understand the data projected on the screen
    2. The more an audience struggles to understand what you are showing them, the more they lose interest and the lower your credibility goes
    3. People trust experts that they understand. Want to be trusted? Then be understood!
    4. People walk away impressed and thinking highly of a presentation if the information provided was clear and easy to comprehend
    5. Technical experts have a reputation for being hard to understand, so if you can surprise the audience by making things simple, you will have a win

    As you develop a presentation, always force yourself to look at what you have drafted through the eyes of the audience it is intended for. What may seem obvious and simplistic to you as an expert may not be perceived the same way by an audience that lacks your expertise and experience. You are used to looking at complex measures and comparing them on the fly. Your audience won’t be as comfortable with that as you are and will need to have information provided at a level that they can easily absorb.

    Make your audience want to attend another show

    People will not go see a comedian a second time if many of the comedian’s jokes are hard to understand because that takes the fun out of the show. Similarly, if you spend a lot of time explaining your charts and graphs, the audience will not be inclined to come to another presentation of yours (at least not happily).

    With a little effort and attention, you can create a presentation that includes compelling and effective charts and graphs while also enabling your audience to easily follow along. To do this you must always remember that as with a joke, if you must explain a chart or graph for people to get your point, then you have failed.

    Author: Bill Franks

    Source: Datafloq

  • The top 10 benefits of Business Intelligence reporting

    The top 10 benefits of Business Intelligence reporting

    Big data plays a crucial role in online data analysis, business information, and intelligent reporting. Companies must adjust to the ambiguity of data, and act accordingly. Spreadsheets no longer provide adequate solutions for a serious company looking to accurately analyze and utilize all the business information gathered.

    That’s where business intelligence reporting comes into play and, indeed, is proving pivotal in empowering organizations to collect data effectively and transform insight into action.

    So, what is BI reporting advancing in a business? It provides the possibility to create smart reports with the help of modern BI reporting tools, and develop a comprehensive intelligent reporting practice. As a result, BI can benefit the overall evolution as well as the profitability of a company, regardless of niche or industry.

    To put the business-boosting benefits of BI into perspective, we’ll explore the benefits of business intelligence reports, core BI characteristics, and the fundamental functions companies can leverage to get ahead of the competition while remaining on the top of their game in today’s increasingly competitive digital market.

    Let’s get started by asking the question 'What is business intelligence reporting?'

    What is BI reporting?

    Business intelligence reporting, or BI reporting, is the process of gathering data by utilizing different software and tools to extract relevant insights. Ultimately, it provides suggestions and observations about business trends, empowering decision-makers to act.

    Online business intelligence and reporting are closely connected. If you gather data, you need to analyze and report on it, no matter which industry or sector you operate in.

    Consequently, you can develop a more strategic approach to your business decisions and gather insights that would have otherwise remained overlooked. But let’s see in more detail what the benefits of these kinds of reporting practices are, and how businesses, whether small companies or enterprises, can achieve profitable results.

    Benefits of business intelligence and reporting

    There are a number of advantages a company can gain if it approaches its reporting correctly and strategically. The main goal of BI reports is to deliver comprehensive data that can be easily accessed and interpreted, and that provides actionable insights.

    Let’s see what the crucial benefits are:

    1. Increasing the workflow speed

    Managers, employees, and important stakeholders are often stuck waiting for a comprehensive BI report from the IT department or SQL developers, especially if a company combines data from different data sources. The process can take days, which slows down the workflow: decisions cannot be made, analysis cannot be done, and the whole company is affected.

    Centralizing all the data sources into a single place, with data connectors that can provide one point of access for all non-technical users in a company, is one of the main benefits a company can have. The data-driven world doesn’t have to be overwhelming, and with the right BI tools, the entire process can be easily managed with a few clicks.

    One additional element to consider is visualizing data. Since humans process visual information 60,000 times faster than text, the workflow can be significantly accelerated by presenting intelligence in the form of interactive, real-time visual data. All of this information can be gathered into a single live dashboard, which ultimately ensures a fast, clear, simple, and effective workflow. Such a report becomes visual, easily accessible, and reliable for gathering insights.

    2. Implementation in any industry or department

    Creating a comprehensive BI report can be a daunting task for any department, employee, or manager. The goals of writing successful, smart reports include cost reduction and improvement of efficiency. One business report example can focus on finance, another on sales, a third on marketing; it depends on the specific needs of a company or department.

    For example, a sales report can act as a navigational aid to keep the sales team on the right track.

    A sales performance dashboard can give you a complete overview of sales targets and insights on whether the team is completing their individual objectives. Of course, the main goal is to increase customers’ lifetime value while decreasing acquisition costs. 
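
The two KPIs mentioned above are easy to quantify. As a minimal sketch (all figures, function names, and the benchmark mentioned in the comment are illustrative assumptions, not any particular tool's formulas):

```python
def customer_lifetime_value(avg_order_value, orders_per_year, retention_years):
    """Simple CLV estimate: yearly revenue per customer times expected lifetime."""
    return avg_order_value * orders_per_year * retention_years

def ltv_to_cac_ratio(clv, customer_acquisition_cost):
    """Lifetime value divided by acquisition cost; a ratio around 3 is often
    cited as a healthy benchmark, but thresholds vary by business."""
    return clv / customer_acquisition_cost

clv = customer_lifetime_value(avg_order_value=80.0, orders_per_year=5, retention_years=3)
print(clv)                                                      # 1200.0
print(ltv_to_cac_ratio(clv, customer_acquisition_cost=300.0))   # 4.0
```

A dashboard would track these two numbers over time: a rising ratio means lifetime value is growing relative to acquisition cost, which is exactly the goal stated above.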

    Financial analytics tools can keep finances under control with numerous features that remove complexity and establish a healthy, holistic overview of all the financial information a company manages.

    It doesn’t stop here. Another business intelligence report sample can be applied to logistics, one of the sectors that can make the most out of business intelligence and analytics, using it to easily track shipments, returns, sizes, or weights, to name just a few.

    Enhancing the recruitment process with HR analytics tools can bring dynamic data under the umbrella of BI reporting, making feedback, interviews, applicant experience, and staffing analysis easier to process and act on.

    3. Utilization of real-time and historical data

    With traditional means of reporting, it is difficult to utilize and comprehend the vast amount of gathered data. Creating a simple presentation out of voluminous information can challenge even the most experienced managers. Reporting in business intelligence is a seamless process, since historical data is also available within an online reporting tool that can process and generate all the business information needed. Artificial intelligence and machine learning algorithms used in these kinds of tools can forecast future values, identify patterns and trends, and automate data alerts.
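
To make the forecasting idea concrete (this is a generic least-squares sketch with made-up numbers, not the algorithm of any specific BI tool), a trend line fitted to historical values can extrapolate the next period:

```python
def linear_forecast(history, steps_ahead):
    """Fit y = a + b*x by ordinary least squares over the history,
    then extrapolate steps_ahead periods past the last observation."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a + b * (n - 1 + steps_ahead)

monthly_sales = [100, 110, 120, 130, 140]   # hypothetical history
print(linear_forecast(monthly_sales, 1))    # 150.0
```

Real BI platforms use far more sophisticated models (seasonality, confidence intervals), but the principle of projecting historical values forward is the same.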

    Another crucial factor to consider is the possibility of utilizing real-time data. The level of sophistication that reporting in BI projects can achieve cannot be compared with traditional approaches. A report written as a Word document will not provide the same amount of information and benefit as real-time data analysis with built-in alarms that forewarn of any business anomaly; such supporting software will consequently increase business efficiency and decrease costs. It is not necessary to establish a whole department to manage and implement this process; numerous presentation tools can help along the way.
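
At its simplest, the kind of data alert described here is a threshold check on how far a live metric deviates from a baseline. A hypothetical sketch (threshold and figures are invented for illustration):

```python
def check_alert(value, baseline, threshold_pct):
    """Return True when the metric deviates from its baseline
    by more than threshold_pct percent, in either direction."""
    deviation = abs(value - baseline) / baseline * 100
    return deviation > threshold_pct

# Daily revenue drops 30% below its baseline: alarm fires.
print(check_alert(value=70, baseline=100, threshold_pct=20))    # True
# A 5% dip stays within tolerance: no alarm.
print(check_alert(value=95, baseline=100, threshold_pct=20))    # False
```

Production systems refine this with rolling baselines and statistical anomaly detection, but the forewarning mechanism is the same idea.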

    4. Customer analysis and behavioral prediction

    There is no company in the world that doesn’t concentrate on its customers. They are ultimately the ones who provide revenue and determine whether a business survives in the market.

    Customers have also become more selective about what they buy and which brands they trust. They prefer brands “who can resonate between perceptual product and self-psychological needs.” If you can tap into their emotional needs and predict their behavior, you will stimulate purchases and provide a smooth customer experience. BI reports can combine these resources and provide a stimulating user experience. The key is to gather information and adjust to user needs and business goals.

    Today there are numerous ways in which a customer can interact with a specific company. Chatbots, social media, emails, or direct interaction; the possibilities are endless.

    The growth of these kinds of engagement has increased the number of communication touchpoints and, consequently, sources of data. All of the information gathered can provide a holistic overview of the customer, explain why a certain strategy worked or failed, connect the cause and effect of customer service reports, and, thus, improve business operations.

    5. Operational optimization and forecasting

    Every serious business uses key performance indicators to measure and evaluate success. There are countless KPI examples to select and adopt in a strategy, but only the right tracking and analysis can bring profitable results. Business intelligence and reporting are not just focused on tracking; they include forecasting based on predictive analytics and artificial intelligence, which can help avoid costly and time-consuming business decisions. Reporting in business intelligence is therefore approached from multiple angles, providing insights that might otherwise stay overlooked.

    6. Cost optimization

    Another important factor to consider is cost optimization. Every business needs to seriously consider its expenses and ROI (return on investment), yet costs and savings are often hard to measure. With business reporting software, both small businesses and large enterprises have access to clear data from which costs and savings can be calculated with just a few clicks.
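
The ROI calculation itself is simple arithmetic; the hard part is gathering the gain and cost figures, which is what the reporting software automates. A minimal sketch with invented numbers:

```python
def roi_pct(gain, cost):
    """Return on investment as a percentage: (gain - cost) / cost * 100."""
    return (gain - cost) / cost * 100

# A hypothetical initiative costing 10,000 that returned 15,000:
print(roi_pct(gain=15000, cost=10000))  # 50.0
```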

    7. Informed strategic decision-making

    Whether you’re a CEO, an executive, or managing a small team, with great power comes great responsibility. As someone with corporate seniority, you will need to formulate crucial strategies and make important choices that have a significant impact on the business. Naturally, decisions and initiatives of this magnitude aren’t to be taken lightly. That’s where reporting business intelligence tools come in.

    Concerning senior decision-making or strategy formulation, it’s essential to use digital data to your advantage to guide you through the process. BI reporting dashboards are intuitive, visual, and provide a wealth of relevant data, allowing you to spot trends, identify potential strengths or weaknesses, and uncover groundbreaking insights with ease.

    Whether you need to streamline your budget, put together a targeted marketing campaign, improve an internal process, or anything else you can think of, leveraging BI will give you the ability to make swift, informed decisions and set actionable milestones or benchmarks based on solid information.

    The customizable nature of modern data analytics tools means that it’s possible to create dashboards that suit your exact needs, goals, and preferences, improving the senior decision-making process significantly.

    8. Streamlined procurement processes

    One of the key benefits of BI-based reports is that if they’re arranged in a digestible format, they offer access to logical patterns and insights that will allow you to make key areas of your business more efficient. This is particularly true if you deal in a high turnover of goods or services. And if this is the case, it’s more than likely that you have some form of a procurement department.

    Your procurement processes are vital to the overall success and sustainability of your business, as its functionality will filter down through every core facet of the organization. Business intelligence reporting will help you streamline your procurement strategy by offering clear-cut visualizations based on all key functions within the department.

    Working with interactive dashboards will empower you to summarize your procurement department’s activities with confidence, which, in turn, will help you catalyze your success while building brand awareness. In the digital age, brand awareness is priceless to the continual growth of your organization.

    Another undeniable benefit of BI in the modern age.

    9. Enhanced data quality

    One of the most clear-cut and powerful benefits of data intelligence for business is the fact that it empowers the user to squeeze every last drop of value from their data.

    In a digital business landscape where new data is created at a rapid rate, understanding which insights and metrics hold real value is a minefield. With so much information and such little time, intelligent data analytics can seem like an impossible feat.

    We’ve touched on this subject throughout this post, but enhanced data quality is such a powerful benefit that it’s worth exploring in its own right. To put this notion into a practical perspective, it’s important to consider the core features and functions of modern BI dashboards:

    • Non-restricted data access: Typically, cutting-edge data intelligence dashboards are accessible across a broad range of mobile devices for non-restricted 24/7 access to essential trends, metrics, and insights. This makes it possible to make informed data-driven decisions anytime, anywhere, increasing productivity in the process.
    • Purity: As modern BI tools operate using highly-visual and focused KPIs, you can take charge of your data, ensuring that the metrics you’re served are 100% relevant to the ongoing success of your business. These intuitive tools work as incredibly effective data curation and filtration systems. As a result, your decisions will be accurate, and you will never waste time on redundant data again.
    • Organizational inclusion: The accessible, seamless functionality of BI tools means that you don’t have to be technically-minded to reap the rewards of data intelligence. As it’s possible to customize each dashboard to the specific needs of your user with ease and extract meaningful insights from a wealth of dynamic KPIs, everyone within the organization can improve their direct performance with data analytics, something that will benefit the entire organization enormously. Today’s dashboards are inclusive and improve the overall value of your organization’s data.
    • Data storytelling capabilities: Our brains are wired to absorb compelling narratives. If you’re able to tell an inspiring, relevant story with your data, you can deliver vital information in a way that resonates with your audience, whether it’s employees or external stakeholders. Intelligence dashboards make data storytelling widely accessible. 

    10. Human resources and employee performance management

    Last but certainly not least in our definitive rundown of BI benefits, we’re going to consider how BI-centric reports can assist performance management.

    By gaining centralized access to performance-based KPIs, it’s easy to identify trends in productivity, compare relevant metrics, and home in on individual performance. In doing so, you can catalyze the success of your business in a big way. To put this into perspective, we’re going to look at human resources and employee performance management.

    In many ways, your employees are the lifeblood of your entire organization. If the talent within your organization is suffering, your business will, too. Keeping your staff engaged and motivated is vital.

    Role or department aside, if your employees are invested in their work, each other, and the core company mission, your business will continue to thrive. But how can reporting business intelligence software help with employee engagement and motivation?

    By gaining access to dynamic visual data on individual as well as collective employee performance, it’s possible to offer training and support to your staff where needed, while implementing leaderboards to inspire everyone to work to the best of their abilities.

    Offering your employees tailored support and growth opportunities, showing that you care, and offering incentives will help you increase motivation exponentially. As a primary duty of the modern human resources department, having the insights to manage internal talent at your disposal is crucial. 

    The ability to interact with focused employee data will empower you to create strategies that boost performance, employee satisfaction, and internal cohesion in a way that gives you an all-important edge on the competition.

    Improved internal communication plays a pivotal role in employee performance and motivation. Find out how big screen dashboards can help improve departmental cohesion with our definitive guide to office dashboards.

    “Data that is loved tends to survive.” – Kurt Bollacker, renowned computer scientist

    Reporting in business intelligence: the future of a sustainable company

    Collecting data in today’s digitally-driven world is important, but analyzing it to its optimum capacity is even more crucial if a business wants to enjoy sustainable success in the face of constant change.

    Reporting and business intelligence play a crucial role in obtaining the underlying figures that explain decisions, and in presenting data in a way that offers direct benefits to the business. As we mentioned earlier, there is no industry that isn’t currently affected by the importance of data and analysis. We have only scratched the surface with these top benefits, which any company can take advantage of to achieve positive business results.

    In this bold new world of data intelligence, businesses of all sizes can use BI tools to transform insight into action and push themselves ahead of the pack, becoming leaders in their field.

    Spotting business issues with a BI solution that provides detailed business intelligence reports creates space for future development, cost reduction, and comprehensive analysis of the strategic and operational state of a company.

    Author: Sandra Durcevic

    Source: Datapine

  • The transformation of raw data into actionable insights in 5 steps

    The transformation of raw data into actionable insights in 5 steps

    We live in a world of data: there’s more of it than ever before, in a ceaselessly expanding array of forms and locations. Dealing with Data is your window into the ways organizations tackle the challenges of this new world to help their companies and their customers thrive.

    In a world of proliferating data, every company is becoming a data company. The route to future success is increasingly dependent on effectively gathering, managing, and analyzing your data to reveal insights that you’ll use to make smarter decisions. Doing this will require rethinking how you handle data, learn from it, and how data fits in your digital transformation.

    Simplifying digital transformation

    The growing amount and increasingly varied sources of data that every organization generates make digital transformation a daunting prospect. But it doesn’t need to be. At Sisense, we’re dedicated to making this complex task simple, putting power in the hands of the builders of business data and strategy, and providing insights for everyone. The launch of the Google Sheets analytics template illustrates this.

    Understanding how data becomes insights

    A big barrier to analytics success has been that typically only experts in the data field (data engineers, scientists, analysts and developers) understood this complex topic. As access to and use of data has now expanded to business team members and others, it’s more important than ever that everyone can appreciate what happens to data as it goes through the BI and analytics process. 

    Your definitive guide to data and analytics processes

    The following guide shows how raw data becomes actionable insights in 5 steps. It will navigate you through every consideration you might need to make about what BI and analytics capabilities you need, and every step of the way that leads to potentially game-changing decisions for you and your company.

    1. Generating and storing data in its raw state

    Every organization generates and gathers data, both internally and from external sources. The data takes many formats and covers all areas of the organization’s business (sales, marketing, payroll, production, logistics, etc.). External data sources include partners, customers, potential leads, etc.

    Traditionally all this data was stored on-premises, in servers, using databases that many of us will be familiar with, such as SAP, Microsoft Excel, Oracle, Microsoft SQL Server, IBM DB2, PostgreSQL, MySQL, Teradata.

    However, cloud computing has grown rapidly because it offers more flexible, agile, and cost-effective storage solutions. The trend has been towards using cloud-based applications and tools for different functions, such as Salesforce for sales, Marketo for marketing automation, and large-scale data storage like AWS or data lakes such as Amazon S3, Hadoop and Microsoft Azure.

    An effective, modern BI and analytics platform must be capable of working with all of these means of storing and generating data.

    2. Extract, Transform, and Load: Prepare data, create staging environment and transform data, ready for analytics

    For data to be properly accessed and analyzed, it must be taken from raw storage databases and, in some cases, transformed. In all cases the data will eventually be loaded into a different place so it can be managed and organized, using a package such as Sisense for Cloud Data Teams. Using data pipelines and data integration between data storage tools, engineers perform ETL (extract, transform, and load): they extract the data from its sources, transform it into a uniform format that enables it all to be integrated, and then load it into the repository they have prepared for their databases.
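
The extract-transform-load flow can be sketched in miniature. This toy example uses invented source schemas and in-memory lists standing in for real databases and connectors; it only illustrates the shape of the pipeline, not any vendor's API:

```python
# Two "sources" with inconsistent schemas (hypothetical field names).
raw_crm = [{"Name": "Acme", "Revenue": "1200"}]
raw_erp = [{"customer": "Initech", "rev_eur": 800}]

def extract():
    """Extract: pull rows from every source, tagged with their origin."""
    yield from (("crm", row) for row in raw_crm)
    yield from (("erp", row) for row in raw_erp)

def transform(source, row):
    """Transform: map each source's schema onto one uniform format."""
    if source == "crm":
        return {"customer": row["Name"], "revenue": float(row["Revenue"])}
    return {"customer": row["customer"], "revenue": float(row["rev_eur"])}

warehouse = []  # stands in for the target repository

def load(record):
    """Load: write the uniform record into the repository."""
    warehouse.append(record)

for source, row in extract():
    load(transform(source, row))

print(warehouse)
# [{'customer': 'Acme', 'revenue': 1200.0}, {'customer': 'Initech', 'revenue': 800.0}]
```

In production the same three stages run against real databases and a cloud warehouse, but the division of labor between extract, transform, and load is exactly this.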

    In the age of the cloud, the most effective repositories are cloud-based storage solutions like Amazon Redshift, Google BigQuery, Snowflake, Amazon S3, Hadoop, and Microsoft Azure. These huge, powerful repositories have the flexibility to scale storage capabilities on demand with no need for extra hardware, making them more agile and cost-effective, as well as less labor-intensive, than on-premises solutions. They hold structured data from relational databases (rows and columns), semi-structured data (CSV, logs, XML, JSON), unstructured data (emails, documents, PDFs), and binary data (images, audio, video). Sisense provides instant access to your cloud data warehouses.

    3. Data modeling: Create relationships between data. Connect tables

    Once the data is stored, data engineers can pull from the data warehouse or data lake to create tables and objects that are organized in more easily accessible and usable ways. They create relationships between data and connect tables, modeling the data in a way that sets relationships which will later be translated into query paths for joins when a dashboard designer initiates a query in the front end. Then users, in this case BI and business analysts, can examine the data, connect and compare different tables, and develop analytics from it.

    The combination of a powerful storage repository and a powerful BI and analytics platform enables such analysts to transform live big data from cloud data warehouses into interactive dashboards in minutes. They use an array of tools to help achieve this. Dimension tables include information that can be sliced and diced as required for customer analysis (date, location, name, etc.). Fact tables include transactional information, which we aggregate. The Sisense ElastiCube enables analysts to mash up any data from anywhere. The result: highly effective data modeling that maps out all the different places where a software application stores information, and works out how these sources of data will fit together, flow into one another, and interact.
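
The fact/dimension split described above can be illustrated with a toy star-schema join (all data invented): transactional rows in the fact table are joined to a customer dimension table and then aggregated by one of the dimension's attributes:

```python
# Dimension table: one row of descriptive attributes per customer.
dim_customer = {1: {"name": "Acme", "region": "EU"},
                2: {"name": "Initech", "region": "US"}}

# Fact table: transactional rows referencing the dimension by key.
fact_sales = [{"customer_id": 1, "amount": 500},
              {"customer_id": 2, "amount": 300},
              {"customer_id": 1, "amount": 200}]

# Join each fact row to its dimension row, then aggregate by region.
totals = {}
for row in fact_sales:
    region = dim_customer[row["customer_id"]]["region"]
    totals[region] = totals.get(region, 0) + row["amount"]

print(totals)  # {'EU': 700, 'US': 300}
```

This is the same join path a dashboard query follows when a user slices sales by region: the foreign key in the fact table resolves against the dimension table, and the measure is aggregated.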

    After this, the process follows one of two paths:

    4. Building dashboards and widgets

    Now developers pick up the baton: they create dashboards so that business users can easily visualize data and discover insights specific to their needs. They also build actionable analytics apps, integrating data insights into workflows by taking data-driven actions through analytic apps. And they define exploration layers, using an enhanced gallery of relationships between widgets.

    Advanced tools that help deliver insights include universal knowledge graphs and augmented analytics that use machine learning (ML) and artificial intelligence (AI) techniques to automate data preparation, insight discovery, and sharing. These drive automatic recommendations arising from data analysis and predictive analytics, respectively. Natural language querying puts the power of analytics in the hands of even non-technical users by enabling them to ask questions of their datasets without needing code, and to tailor visualizations to their own needs.

    5. Embed analytics into customers’ products and services

    Extending analytics capabilities even further, developers can create applications that they embed directly into customers’ products and services, so that they become instantly actionable. This means that at the end of the BI and analytics process, when you have extracted insights, you can immediately apply what you’ve learned in real time at the point of insight, without needing to leave your analytics platform and use alternative tools. As a result, you can create value for your clients by enabling data-driven decision-making and self-service analysis. 

    With a package like Sisense for Product Teams, product teams can build and scale custom actionable analytic apps and seamlessly integrate them into other applications, opening up new revenue streams and providing a powerful competitive advantage.

    Author: Adam Murray

    Source: Sisense

  • Toucan Toco introduces real-time data analytics for franchise chains

    Toucan Toco introduces real-time data analytics for franchise chains

    Interactive tool gives franchises data-driven insights to boost sales

    Toucan Toco, a specialist in data storytelling, is now also introducing its proven analytics solution for franchise businesses in the Netherlands. The interactive tool includes a dashboard that visualizes all kinds of data for franchise chains, providing key insights to improve their market share and customer satisfaction and to increase their revenue. Users also benefit from the power of data storytelling in the visualization module, for clearer insights from existing data.
    Unlike conventional franchise management software, Toucan Toco offers a fully interactive solution for tracking performance, giving franchisees quick access to conversion rates, sales figures, top producers, and other key performance indicators. Results and differences can be compared and analyzed per franchise. The analyses and data insights can then be used to sharpen KPI management and drive continuous optimization, which delivers results for both individual franchisees and the chain as a whole.
    Customers already working with the dashboard have seen a 5 percent increase in market share, a 7 percent increase in sales performance, and a 3.5 percent increase in customer satisfaction.
    The analytics solution offers benefits for franchise chains and franchisees alike:

    Franchise chains traditionally rely on Excel-based reporting files, which makes it difficult to make smart, data-driven decisions. Toucan Toco’s data visualization platform compares KPIs and performance metrics across regions, product lines, and more, so that franchises can continuously view, understand, and improve their performance.

    Better communication

    Thanks to centralized data management, managers throughout the organization, from headquarters to local branches, have the same, understandable information and insights. This enables managers to take action faster and make smarter decisions.

    Attracting new franchisees

    With differentiated, data-driven management processes, it is easier for new franchisees to understand what is expected, see the current results per location, verify compliance, and follow guidance from headquarters. This results in a simpler onboarding process and better communication about KPIs and performance.

    Insights on any device

    Franchise employees generally work without a fixed desk or PC, so a mobile-first analytics platform is essential. Toucan Toco’s solution is available on mobile, tablet, and desktop, giving employees access to insights anywhere.
    Franchise chains and franchisees also benefit from Toucan Toco’s expertise in data communication, which improves the understanding of data, including by non-specialists. The integrated analytics platform uses data storytelling to present data simply and intuitively in understandable, interactive visualizations, so that data-driven decisions can be made across the organization.
    “A broad understanding of data among employees and powerful communication through data storytelling play an important role in improving the performance of franchise chains and individual franchise locations,” says Baptiste Jourdan, co-founder of Toucan Toco. “The analytics solution for franchise chains is a great step in our mission to help organizations get the most out of their data and to enable all employees to make the right decisions faster.”
    Source: Toucan Toco
  • Using social media data to analyze market trends

    Using social media data to analyze market trends

    Market trend analysis is an indispensable tool for companies these days. Social media gives analysts access to data that might otherwise be tough to collect. Rapidly changing business conditions require deep insight, and a market trend analysis report is a critical tool. Aside from future-proofing businesses, trend analysis reports also help companies tune into current dynamics and create better products or services.

    There are many tools and data sources trend analysts use to prepare a market analysis report. However, social media data offers the most fertile ground. Today there are over 4.5 billion social media users worldwide. That’s over half the world’s population accessing social media and interacting with content.

    Social media data is even more valuable because of the high costs of generating original research from scratch. In essence, social media platforms offer all the data companies need, and cost-effectively. Here are three major insights that market trend analysts can derive from social media data.

    Consumer Preferences

    Every business lives and dies with its customers, and assessing consumer preferences is a tough task. While existing customers often make their intentions clear with their purchase patterns over time, market trends often shift and push potential future customers away from a brand’s messaging.

    “Usually, the first signs of a shift (in market trends) show themselves through social media or engagement metrics,” writes SimilarWeb’s Daniel Schneider in a recent post on market trend analysis. “This crucial rise or fall in traffic, engagement, or variation in demographics is what reveals your competitive advantage.” In this context, competitive advantage refers to a company or brand’s position in the market and its appeal to consumers, relative to how its competitors are perceived in “the conversation.”

    Social media engagement data offers a wealth of insight in this regard. For instance, high-level data such as the number of comments or likes, and engagement per hashtag, provide companies insight into which topics niche consumers are interested in. Monitoring the trends in these metrics also reveals broader market shifts.
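
Aggregating engagement per hashtag is a simple fold over post-level metrics. A minimal sketch with invented post data (real analyses would pull these fields from a platform API):

```python
# Hypothetical post-level engagement data.
posts = [{"hashtag": "#sale", "likes": 120, "comments": 30},
         {"hashtag": "#newproduct", "likes": 80, "comments": 10},
         {"hashtag": "#sale", "likes": 50, "comments": 20}]

# Sum likes and comments per hashtag.
engagement = {}
for p in posts:
    engagement[p["hashtag"]] = engagement.get(p["hashtag"], 0) + p["likes"] + p["comments"]

print(engagement)  # {'#sale': 220, '#newproduct': 90}
```

Tracking these totals over time, rather than as a one-off snapshot, is what reveals the broader market shifts the paragraph above describes.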

    A company’s engagement rate trend and conversion ratios offer insight into marketing effectiveness over time. In the same way that a decreasing sales conversion rate over time points to a possible disconnect with consumers, so too does a falling follower or subscriber count.

    Thanks to rising social awareness, companies are expected to take stands on important issues these days. Monitoring the usage of hashtags related to these issues, keeping an eye on trending topics, and tracking engagement metrics on content that addresses these issues helps companies easily tune into the current market climate.

    When compared to conducting surveys or polls, social media data can reduce certain response biases and present user opinions in a more natural, useful manner.

    Seasonal Trends

    Many industries are subject to seasonal trends, and market analysts need to figure them out. The consequences of predicting an incorrect trend can be catastrophic, thanks to production and procurement schedules tied to seasonal demand. 

    A market trend analysis that mines social media demographic data will uncover seasonal trends at multiple levels. At a high level, trend analysts can figure out who their customers are and what their tendencies look like. Platforms such as Facebook’s Ads Manager provide a wealth of information, right down to the type of devices the user prefers and even their political leanings.

    Analysts can dig deeper into these data and uncover specific data points that help them segment their customer audience. For instance, customers older than 50 might prefer a product during fall, while a younger audience might prefer it during spring. By providing demographic data, trend analysts can help their companies meet demand intelligently.
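
The age-by-season segmentation in the example above amounts to a two-key group-and-count. A toy sketch with invented purchase records (the 50-year cutoff mirrors the example, not any standard segmentation):

```python
# Hypothetical purchase records with buyer age and purchase season.
purchases = [{"age": 55, "season": "fall"},
             {"age": 24, "season": "spring"},
             {"age": 61, "season": "fall"},
             {"age": 30, "season": "spring"},
             {"age": 58, "season": "fall"}]

# Count purchases per (age group, season) segment.
segments = {}
for p in purchases:
    group = "50+" if p["age"] >= 50 else "under 50"
    key = (group, p["season"])
    segments[key] = segments.get(key, 0) + 1

print(segments)  # {('50+', 'fall'): 3, ('under 50', 'spring'): 2}
```

Segment counts like these feed directly into the production schedules mentioned below: each audience's seasonal peak tells the company when to stock for it.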

    Market trend reports informed by such data help companies anticipate trends that might develop in the future. As strategic business advisor Bernard Marr points out, “By practicing market analysis, you can stay on top of which trends are having the most influence and which direction your market is headed — before any major changes take place — leaving you well placed to surpass your competition.”

    Social media data provides companies an easy way to access data that points to major trend changes. Demographic data allows companies to isolate audiences who might form a future customer base and figure out their preferences in advance. In turn, this helps them create production schedules that match that audience’s seasonal preferences.

    Market Dynamics

    The market a business operates in is subject to a variety of forces. Chief among these is competitor activity. Disruptive products introduced by competitors can seriously harm a company’s earning ability. A famous example of this is Apple eliminating the likes of Palm and BlackBerry within a few years of the release of the iPhone.

    Monitoring a brand’s social share of voice and comparing it to the competition helps trend analysts figure out who occupies the top of users’ minds in the market. Analysts can also correlate these trends with sales volumes, connect them to product improvements and marketing strategies, and discover broad market trends. These data also help companies build lasting relationships with their customers.

    Given the fast pace with which consumer preferences change these days, traditional data-gathering techniques will leave companies playing catch-up. “Because so much of the world is sharing its opinions on every subject at all hours of the day, trends and markets can shift quickly,” says Meltwater’s Mike Simpson. “It is not just the customer of next year or next month that organizations need to consider — but the customer of the next day.”

    Whether it’s trends in engagement, demographics, or competitor data, social media data helps analysts gain perspective on where the market is headed.

    A Full Picture

    Social media platforms offer a treasure trove of user data. Market trend analysts can mine these data continuously to connect business performance and consumer behavior. Social media gives companies a real-time, cost-effective look into their customers’ minds compared to traditional data-gathering methods.

    Author: Ralph Tkatchuk

    Source: Dataconomy

  • Different perspectives on the transition to the cloud

    Different perspectives on the transition to the cloud

    Recently, CIO and Juniper Networks hosted a gathering in the old air traffic control tower at Schiphol-Oost. Together with seven invitees they discussed, quite fittingly in that setting, the move to the cloud. Opinions on the matter turned out to differ considerably.

    We conducted the discussion around three propositions. The first concerns pure network connectivity, the second security, and the third automation. This way we start with the basics, connectivity, then look at how it should be secured, and finally at how connectivity and security can be automated.

    Proposition 1: A network is a network, regardless of the deployment model

    We discuss this first proposition with Naomi Du Burck, manager front office IT Operations at ANWB in The Hague, and with Peter Verdiesen, head of ICT at Countus accountants en adviseurs in Zwolle. These are two people from sectors where public cloud mainly brings challenges to mind. For both organizations, then, it very much matters what the network looks like.

    For both participants, the GDPR is without doubt the most obvious red flag when it comes to the cloud. Both ANWB and Countus handle large amounts of data from members and clients. 'That makes cloud in general rather complicated,' they agree. They cannot simply move their data to the cloud. ANWB also has a lot of legacy. 'That effectively rules out the public option in the current situation,' says Du Burck.

    For both ANWB and Countus, the network and the infrastructure as a whole are subordinate to how they are allowed to handle user data. The same applies to the telemetry and analytics you can extract from the network through continuous monitoring. Verdiesen: 'We would like to use this company-wide, but we are not allowed to without additional GDPR measures that make it much more complex.' He does see that very valuable insights could be gained from it.

    That is not to say analytics is off-limits by definition: 'One-on-one it is allowed, between a single client and Countus, but scaling up is not permitted without the consent of everyone involved,' says Verdiesen. Until the legislation is amended, they are stuck with that, he concludes.

    Although Du Burck ventures that 'ANWB will never go fully public,' a hybrid model is certainly being considered. Think of providing workplaces for employees: Office 365, but also remote access to the ANWB environment, is something they are working on. After all, they have to keep up with developments around the modern workplace.

    Finally, Du Burck notes that a move to a public cloud requires many more things to be investigated, agreed upon, and changed before it can be realized. Think of changes in governance, operations, policy, budgets, and so on.

    Proposition 2: Private means more control over privacy and compliance

    We discuss the second proposition with Erik van der Saag, ICT sector manager at the Tabijn school group, and with Duncan Megens, who like Du Burck works at ANWB, but as manager backend - IT operations. Here we have two radically different organizations at the table. According to Van der Saag, Tabijn is already 80% in the cloud, while at ANWB it is not even 10%.

    We quickly end up in an almost philosophical discussion about what we mean by control. Megens: 'What do you mean by control? I can interpret that in several ways. Is it about the theoretical control you have over your network, or about how well you actually have it under control? That is quite a difference, also for the business as a whole.'

    Take Office 365 as an example. Tabijn is in the middle of making the switch, and at ANWB preparations for the transition are also underway. 'You certainly do give up control when you move to Office 365,' according to Van der Saag, something Megens wholeheartedly agrees with. Yet given the cloud character of the service, you actually gain control when it comes to patching and the like, depending on how you set things up.

    The bottom line, both men agree, is that there is little point in talking about control without talking about how something is set up. Van der Saag gives the example of exporting student data to the cloud. Tabijn certainly does this, but there are conditions to comply with: 'Student data can no longer simply be exported. A layer has to be added in between to ensure the data is also secure.' In the end, Tabijn has just as much control over privacy and compliance in the cloud as it would on-premises.

    The conclusion of this discussion is that the proposition is not necessarily true. If you set up the network and the infrastructure properly, it does not matter where your data resides or where your applications run. Ultimately this is also a matter of trust. There is often still a feeling that data is less safe outside the walls of your own environment, but that need not be the case. This undoubtedly also has to do with a generation gap, so over time the gut-feeling argument will be heard less often.

    Proposition 3: The cloud takes the burden off your shoulders

    For the third and final proposition we join Daniel Treep, architect at KPN, and Martijn Jonker of Andarr Technology. KPN needs little introduction; Andarr is a company that, in its own words, is 'not for the faint of heart' and offers ICT consultancy and secondment services to organizations.

    At the table we quickly agree that 'taking the burden away' may not be the best-chosen term. For customers making the transition from a private environment, the feeling is usually that complexity is being added, especially where the network is concerned. You suddenly have to fit all kinds of different platforms into a single network architecture. You can hardly call that being unburdened; rather the opposite. You could also see this as a transition phase you simply have to get through. At least, that is what we gather from Jonker's remark that 'offering everything in the cloud is the ultimate dream in terms of maturity.'

    According to Treep, it matters a great deal what you consume in the cloud. With SaaS you simply purchase a complete service which, if all is well, takes the work off your hands. Jonker immediately adds a caveat, because very little has been laid down about how services must be offered in the cloud. 'If all is well, a service is set up so that, for example, you cannot access it without a password, but there is no obligation whatsoever to do so.' So your worries cannot go completely out the window when purchasing SaaS, according to Jonker.

    In contrast to SaaS, with PaaS and IaaS you have, according to Treep, virtually the same concerns as with other deployment models. Jonker agrees: 'a programmer can do whatever he wants with it and create an enormous data lake you no longer have any overview of.'

    For Treep, the bottom line is simple when it comes to handing things over in the cloud. 'How much control do you get over the platform? That is ultimately what it comes down to.' If you have a lot of control, you can set things up in such a way that you have few worries about them. Automation plays a clear role here: 'Automation is the foundation of any cloud approach.'

    Automation and unburdening are thus closely related, so in that sense you could say the proposition conceptually holds water, even though both men add the necessary caveats.

    If you interpret unburdening to mean that all your worries as an IT manager are over, you will be in for a rude awakening, both men think. 'You trade your current worries for different worries in the cloud,' is the clear conclusion.

    Conclusion: Different paces and different end goals

    Discussions like the one described above, between managers from widely varying organizations, always yield a nice cross-section of the market. When it comes to the move to the cloud, it is clear that not every organization moves at the same pace, but also that not every organization has, or should have, the same end goal.

    If you are an organization that thinks from the infrastructure, and thus the network, when thinking about change, you are much more inclined to view the propositions positively. Put bluntly, you then assume that by properly configuring the various interfaces it should be perfectly possible to create one large logical network in which you have control and can automate everything.

    If, as an organization, you are mainly occupied with personal data and your applications matter far more than your infrastructure and network, you will be less positive. That is only logical, because then it is not a technical exercise. The GDPR, for example, is hardly about technology at all.

    This type of organization can certainly make the move to the cloud too, but far more additional measures need to be taken. At a school group like Tabijn, for example, that is somewhat more manageable than at ANWB, to name just one contrast.

    That said, in our view you can question whether every organization should want to pursue the 'ultimate dream of maturity' that Martijn Jonker of Andarr spoke of. In some cases it will remain a dream, or will always be seen as a nightmare. Let us also hope that, for some, their performance in the flight simulator is not an omen of how their transition to the cloud will turn out.

    Author: Sander Almekinders

    Source: CIO

  • Five ways business intelligence is changing in 2021

    Five ways business intelligence is changing in 2021

    Every decision made in a company should contribute to a positive outcome. There is no more room for resistance, waiting, or gut-feel decisions. Companies must adapt and respond quickly. This calls for data-driven decisions and easily obtained insights from data. Yann Toutant, country manager Benelux at Toucan Toco, a specialist in data storytelling, sees five ways business intelligence is changing so that everyone in an organization is enabled to make data-driven decisions.

    1. Daily KPI insights, on every device

    Companies are used to analyzing KPIs on a quarterly basis. The challenges that arose over the past year showed this is no longer enough: organizations now need weekly or even daily analyses and communication for much-needed insights and course corrections. This means a higher production rate for reports, and not all organizations are prepared for it yet. Especially when reports are still drawn up in Excel and shared via PowerPoint or email, delays arise. This underlines the need for business intelligence tools, in which insights should moreover be available in real time, on all devices.

    2. Snackable data

    IT departments, BI tools, and data experts make the data available within an organization accessible to the rest of it. This data is presented like a gigantic all-you-can-eat buffet, with an abundance of options. Because of the enormous choice, however, the end users of that data lose the overview. They drown in data and, as its complexity increases, struggle to set priorities and take action. What they actually want is a snack instead of a buffet: a portion tailored to their needs, at a moment of their own choosing.
    Context is crucial here. By making only the most important information available and visualizing it in the right context, everyone can see the performance at a glance.

    3. One screen, one insight

    In everyday life, people are used to apps such as Uber and Spotify. These are designed so intuitively that users need no training in how they work. In business, however, gigantic reports and enormous Excel files are still in use, amounting to one big puzzle. This way of presenting data does not lead to (quick) insights. Effective data presentation must therefore be as intuitive as the apps we use every day. A clear rule of thumb here is 'one screen, one insight.'

    4. Focus on data communication

    Despite modern analytics and business intelligence platforms, insights from data are not contextualized, easy to consume, or actionable for the majority of business users. This is because business intelligence tools were built for data exploration, not data communication. Data communication is the final step in organizations' data challenge. The goal is to provide end users with one specific insight, with simplicity as the key word. Data storytelling makes this possible by adding context to data visualization, so that insights from data become easily available organization-wide.
    The companies that manage to communicate insights from data to end users in a way that prompts them to act stand a better chance of success.

    5. Decline in dashboard use

    The use of dashboards and reports aimed at visual exploration grew for more than 20 years, but has recently begun to decline. Dashboards are complex and not made for users who are not analysts. There is a need for a clearer form of communication, in which the focus shifts from data analysts to the end users of data. Dashboards are therefore being replaced by more automated, consumer-oriented experiences in the form of dynamic data stories. These place data in context to deliver insight and analysis to the end user. This changes how and where users engage with insight monitoring and analysis.
    In 2021 every decision must be data-driven and all employees must have quick access to data. As data visualization and communication offer end users ever more, Toutant expects organizations to make even better use of the data at their disposal and to uncover unexpected insights.
    Author: Yann Toutant
    Source: Toucan Toco
  • What adjustments does a future with 5G internet require?

    What adjustments does a future with 5G internet require?

    Fast fifth-generation mobile internet (5G) will be a reality within a few years. It promises faster download and upload speeds, more capacity, and more stable connections. Although many see the benefits, we should not underestimate the societal transformation that comes with it.

    Besides the impact 5G will have on businesses, society as a whole will have to adapt. For 5G and the applications we envision for it, it is not even the speed that is crucial, but the reliability and consistency of the connection. That demands quite a bit of infrastructure. With more than 75 billion devices connected to the internet worldwide by 2025, the amount of data will grow enormously, and so will the required capacity. The implications of the new network are greater than they appear at first glance. What else does 5G bring?

    1. Organic interaction

    We use our smartphones to exercise, read, listen to music, and so on. We even use them to improve our health. What is not healthy, however, is that we routinely stare at a screen for hours a day. Apps for setting a time limit are therefore in high demand, and the love of endless scrolling is starting to cool. 5G is a welcome innovation here. Thanks to the reduction in latency, and thus the improvement in responsiveness, our thumbs get more rest: we will operate our phones more with voice and gestures.

    With a fully seamless and invisible network that connects all devices wirelessly, data is transferred and stored at high speeds. Technology will continue to support us in practically everything we do, but in a more natural form. The smartphone will certainly remain, but probably in a more organic, invisible way.

    2. More data, more data centers

    When we switch to 5G en masse, that places certain demands on the network infrastructure. A 5G connection can move data almost a thousand times faster than today's fiber-optic network. And with the expected number of connected devices growing every year, considerably more data will soon be in circulation. Transmitting this data reliably at higher speeds requires significantly more capacity.

    Compare processing this data with a water supply. If you need fifty liters of water, you first consider how far away the source is. Then you ask: how thick is the hose, and how much constant pressure is applied to pump the water out? You can look at data the same way. Does the current network still meet the required data volumes and the pressure that calls for?
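The water-hose analogy above maps directly onto a back-of-the-envelope throughput calculation: transfer time is data volume divided by usable bandwidth. The link speeds and the 50 GB figure below are illustrative only, and protocol overhead is ignored:

```python
def transfer_seconds(gigabytes, megabits_per_second):
    """Seconds needed to move a data volume over a link of a given speed.

    Uses decimal units: 1 GB = 8000 megabits. Overhead is ignored.
    """
    megabits = gigabytes * 8 * 1000
    return megabits / megabits_per_second

# Moving 50 GB over a 100 Mbit/s (4G-class) link vs. a 10 Gbit/s (5G-class) link.
slow = transfer_seconds(50, 100)     # 4000 seconds, over an hour
fast = transfer_seconds(50, 10_000)  # 40 seconds
```

The "pressure" side of the analogy, latency and its consistency, is what the calculation leaves out, and it is exactly the part edge computing addresses.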

    Current and future network infrastructure must evolve to give 5G the support it needs: reliable, consistent, and fast. Part of the answer lies in edge computing: micro data centers can take some of the pressure off. In effect, you move part of the data processing toward the end point, which reduces noise and latency. Although edge computing is not suitable for every purpose, it can certainly help distribute the load, especially in a world full of IoT (Internet of Things).

    3. Privatization of the networks

    When it comes to 5G and all its possibilities, the most extreme examples are often cited, such as the self-driving car. But also think of the surgeon performing a remote operation with a robotic arm, or the firefighter battling fires with the help of a superfast, real-time internet connection. The latter is essential to react and anticipate at the right speed.

    In such cases, where it can literally be a matter of life and death, preventing jitter (variation in delay) is crucial. If the connection falters for even a second, it may be too late for the patient, the oncoming driver, or the victim. What is needed here is a closed network, set up specifically for such purposes, far removed from interference from other applications.

    To achieve this, new regulation may even be required, drawing a distinction between technology that affects health and safety and technology that exists purely for entertainment. The risks are simply not the same.

    Author: Petran van Hugten

    Source: CIO

  • What exactly is a Data Fabric? Definitions and uses

    What exactly is a Data Fabric? Definitions and uses

    A Data Fabric “is a distributed Data Management platform whose objective is to combine various types of data storage, access, preparation, analytics, and security tools in a fully compliant manner to support seamless Data Management.” This concept has gained traction as technologies such as the Internet of Things need a consistent way of making data available to specific workloads or applications. It is key for retrieving data across multiple locations spanning the globe, since many companies use a variety of storage system configurations and cloud providers.

    Other definitions of a Data Fabric include:

    • “A solution to the phenomenon where datasets get so large that they become physically impossible to move.” (Will Ochandarena)
    • “A comprehensive way to integrate all an organization’s data into a single, scalable platform.” (MapR)
    • “An enabler of frictionless access of data sharing in a distributed data environment.” (Gartner)
    • “An information network implemented on a grand scale across physical and virtual boundaries – focus on the data aspect of cloud computing as the unifying factor.” (Forbes)
    • A design allowing for “a single, consistent data management framework, allowing easier data access and sharing in a distributed environment.” (TechRepublic)

    Businesses use a Data Fabric to:

    • Handle very large data sets across multiple locations more quickly
    • Make data more accessible
    • Optimize the entire data lifecycle to enable applications that require real-time analytics
    • Integrate data silos across an environment
    • Deliver higher value from data assets
    • Allow machine learning and AI to work more efficiently

    Author: Michelle Knight

    Source: Dataversity

  • What to expect from data decade 2020-2030?

    What to expect from data decade 2020-2030?

    From wild speculation that flying cars will become the norm to robots that will be able to tend to our every need, there is lots of buzz about how AI, Machine Learning, and Deep Learning will change our lives. However, at present, it seems like a far-fetched future. 

    As we enter the 2020s, there will be significant progress in the march towards the democratization of data that will fuel some significant changes. Gartner identified democratization as one of its top ten strategic technology trends for the enterprise in 2020 and this shift in ownership of data means that anyone can use the information at any time to make decisions.

    The democratization of data is frequently referred to as citizen access to data. The goal is to remove any barriers to access or understand data. With the explosion in information generated by the IoT, Machine Learning, AI, coupled with digital transformation, it will result in substantial changes in not only the volume of data but the way we process and use this intelligence.

    Here are four predictions that we can expect to see in the near future:

    1. Medical records will be owned by the individual

    Over the last decade, medical records have moved from paper to digital. However, they are still fragmented, with multiple different healthcare providers owning different parts. This has generated a vast array of inefficiencies. As a result, new legislation will come into effect before the end of 2023 that will allow people to own their health records rather than doctors or health insurance companies.  

    This law will enable individuals to control access to their medical records and share them only when they decide. By owning your golden health data record, all of the information will be in one centralized place, allowing the providers you share it with to make fully informed decisions in your best interest. Individuals will have the power to determine who can view their health records, which will take the form of a digital twin of your files. When you visit a doctor, you will take this health record with you and check it in with the health provider; when you check out, the provider will be required to delete your digital footprint.

    When you select medication at CVS, for example, the pharmacist will be able to scan your smart device to see what medications you are taking, along with other health indicators, and then advise whether the drug you selected is optimal for you. This will shift our approach to healthcare from a reactive to a personalized, preventative philosophy. Google has already started down this path with its Project Nightingale initiative, whose goal is to use data, machine learning, and AI to suggest changes to individual patients’ care. By separating the data from the platform, it will in turn fuel a whole new set of healthcare startups driven by predictive analytics that will, in time, change the entire dynamics of the health insurance market. This will usher in a new era of healthcare that moves toward the predictive maintenance of humans, killing the established health insurance industry as we know it. Many of the incumbent healthcare giants will have to rethink their business model completely; what form this will take, however, is currently murky.

    2. Employee analytics will be regulated

    An algorithm learns based on the data provided, so if it’s fed with a biased data set, it will give biased recommendations. This inherent bias in AI will see new legislation introduced to prevent discrimination. The regulation will put the onus on employers to ensure that their algorithms are not prejudiced and that the same ethics that they have in the physical world also apply in the digital realm. As employee analytics determine pay raises, performance bonuses, promotions, and hiring decisions, this legislation will ensure a level playing field for all. As this trend evolves, employees will control their data footprint, and when they leave an organization rather than clearing out their physical workspace, they will take their data footprint with them.

    3. Edge computing: from niche to mainstream

    Edge computing is dramatically changing the way data is stored and processed. The rise of IoT, serverless apps, peer-to-peer, and the plethora of streaming services will continue to fuel the exponential growth of data. This, coupled with the introduction of 5G, will deliver faster networking speeds, enabling edge computing to process and store data faster to support critical real-time applications like autonomous vehicles and location services. As a result of these changes, by the end of 2021, more data will be processed at the edge than in the cloud. The continued explosive growth in data volume, coupled with faster networking, will drive edge computing from niche to mainstream as processing shifts from predominantly the cloud to the edge.

    4. Machine unlearning will become important

    With the rise in intelligent automation, 2020 will see the rise of machine unlearning. As the volume of data sets continues to grow rapidly, knowing what learning to follow and what to ignore will be another essential aspect of intelligent data. Humans have a well-developed ability to unlearn information; however, machines currently are not good at this and are only able to learn incrementally. Software has to be able to ignore information that prevents it from making optimal decisions rather than repeating the same mistakes. As the decade progresses, machine unlearning where systems unlearn digital assets will become essential in order to develop secure AI-based systems.

    As the democratization of intelligent data becomes a reality, it will ultimately create a desirable, egalitarian end-state where all decisions are data-driven. This shift, however, will change the dynamics of many established industries and make it easier for smaller businesses to compete with large established brands. Organizations must anticipate these changes and rethink how they process and use intelligent data to ensure that they remain relevant in the next decade and beyond.

    Author: Antony Edwards

    Source: Dataconomy

  • Which types of analytics do business use?

    Which types of analytics do business use?

    Data analytics in businesses help uncover competitive intelligence, actionable insights and trends. Different types of analytics enable businesses to gain an edge over their competitors. In the “data first” era, data-driven insights and decisions have become the key drivers of business performance. This post reviews the different types of data analytics routinely used in enterprises.

    The Raw Data for Different Types of Analytics

    Every business collects a vast range of data, namely sales data, supply chain data, customer data, employee (HR) data, transactional data, and much more. The data sources or channels can be numerous—sensors, applications, surveys, emails, or chats. The data type can be structured, semi-structured, or unstructured. Businesses have to rely on data analytics to make sense of huge piles of collected data.

    The raw data is collected, cleaned, and prepared, then analyzed for result-oriented outcomes. Data analytics can be applied to different business functions in different ways: predictions for the future; audience behavior trends for marketing; quarterly sales trends and patterns; viewer analytics for websites; customer feedback trends in social media, and so on.

    Many types of data analytics are presently used across sectors like healthcare, banking, insurance, fintech, HR, and manufacturing.

    The Various Types of Data Analytics

    A Career Foundry guide states: “In some ways, data analytics is a bit like a treasure hunt; based on clues and insights from the past, you can work out what your next move should be.”

    This guide also frames each type of data analytics as a series of questions about data, which makes the surrounding definitions crystal clear. Global businesses have four basic types of data analytics at their disposal for a wide variety of purposes, categorized as descriptive, diagnostic, predictive, and prescriptive analytics. If you are completely new to data analytics, this blog post is a good place to begin.

    The different types of analytics:

    • Descriptive analytics, commonly regarded as the simplest type of data analytics, explains what happened in the past. This type of analytics is especially helpful in understanding customer preferences and choices, or which products or services performed well. Some examples: reports, descriptive statistics, and data dashboards.
    • Diagnostic analytics looks at and attempts to analyze the “whys” of past events. In other words, diagnostic analytics investigates why certain things happened the way they did. This type of analytics can be helpful in identifying problems currently present within business operations. Some examples: data mining, data discovery, and correlations.
    • Predictive analytics relies on historical data (trends, patterns, logs) to make predictions about the future. This type of data analytics can help in anticipating and planning for future problems, for example, risk assessment, demand forecasting, patient care outcomes. Predictive analytics can also help uncover probable opportunities for business growth and profit. Usually statistical models like decision trees, regression models, or neural networks use past data to predict future outcomes. Some examples include fraud detection, custom recommendations, risk analysis, and inventory forecast.
    • Prescriptive analytics, considered the most complicated type of data analytics, will not only make future predictions but also recommend remedial actions for a positive outcome, for example, risk mitigation. This type of analytics, requiring high volumes of data, can also be time-consuming and costly. Some examples: lead scoring, investment aids, and content recommendations for social apps. Here are some prescriptive analytics use cases.
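    To make the distinction concrete, even the simplest of the four types, descriptive analytics, can be sketched in a few lines of code. The figures below are hypothetical, and the summary is just basic descriptive statistics over past sales:

```python
from statistics import mean, median

# Hypothetical quarterly sales figures (in thousands) for one product line.
quarterly_sales = {"Q1": 120, "Q2": 135, "Q3": 150, "Q4": 128}

def describe(sales):
    """Descriptive analytics: summarize what happened in the past."""
    values = list(sales.values())
    return {
        "total": sum(values),
        "average": mean(values),
        "median": median(values),
        "best_quarter": max(sales, key=sales.get),
    }

summary = describe(quarterly_sales)
```

    A report or dashboard would simply present these figures; diagnostic, predictive, and prescriptive analytics each build further on the same underlying data.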

    Businesses first need to determine which type of data analytics they need for a particular situation before expecting the benefits. 

    Use of AI and ML with Data Analytics

    In the artificial intelligence (AI) era, one cannot think of any data-driven operation without the presence of AI or machine learning (ML). ML algorithms learn to make predictions by studying high volumes of past data. Hybrid models, on the other hand, combine a number of predictive analytics techniques to deliver more accurate predictions. After selecting one or more models, businesses have to train the model with available data. The data often comes from a combination of internal and external sources.

    In AI-powered predictive analytics platforms, the trained model is used to predict future outcomes. The actionable insights can be used to develop marketing campaigns, set prices for new products, or plan investments.
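    As an illustrative sketch of this "train on past data, predict future outcomes" loop, the snippet below fits a simple linear trend to hypothetical monthly demand figures. A real AI-powered platform would use far richer models (decision trees, neural networks) and far more data:

```python
def fit_linear_trend(y):
    """Ordinary least squares fit of y = a + b * x, for x = 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    b = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y)) / \
        sum((x - x_mean) ** 2 for x in range(n))
    a = y_mean - b * x_mean
    return a, b

# Hypothetical demand history (units sold per month), e.g. from internal sales data.
history = [100, 104, 109, 113, 118, 121]

a, b = fit_linear_trend(history)          # "training" on past data
forecast_next = a + b * len(history)      # "operating": predict next month
```

    The trained parameters (a, b) play the role of the model; applying them to a future period is the prediction step that feeds actionable insights such as demand forecasts.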

    Social Data Analytics: Use of Social Media

    With the rise of social media channels for online shopping, two other types of data analysis have surfaced alongside traditional data analytics: sentiment analysis and customer behavior analysis. Businesses are now able to collect large volumes of customer behavior data in the form of likes, tweets, or comments. According to this article about social media analytics (SMA), SMA refers to an “approach of collecting data from social media sites” for making enhanced business decisions. This process involves deeper analysis of social data.

    Customer behavior analysis: Popular communication channels like emails, chat scripts, video-conferencing logs, and online feedback add to the endless cycle of customer behavior data. Smart business operators collect, store, and routinely analyze this data to better understand their customers—their likes, dislikes, tastes, and preferences.

    Sentiment analysis: This unique type of data analysis is used to measure the collective sentiment of a particular audience group. Sentiment analysis helps to dive deep into customer behavior, and can be particularly useful for marketing or customer service.
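    A minimal sentiment analysis can be done with a word-polarity lexicon. The tiny lexicon and comments below are invented for illustration; production systems typically use large lexicons or trained ML models instead:

```python
# Tiny hypothetical sentiment lexicon: +1 positive, -1 negative.
LEXICON = {"great": 1, "love": 1, "fast": 1,
           "slow": -1, "broken": -1, "terrible": -1}

def sentiment_score(text):
    """Sum word polarities: > 0 positive, < 0 negative, 0 neutral."""
    return sum(LEXICON.get(word.strip(".,!?"), 0)
               for word in text.lower().split())

comments = ["Love the new dashboard, great work!",
            "Checkout is slow and the app feels broken."]
scores = [sentiment_score(c) for c in comments]  # one positive, one negative
```

    Aggregating such scores over thousands of comments gives the collective sentiment of an audience group over time.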

    An essential step to “measuring social media success” is to align the goals of your social marketing strategy with developed KPIs.

    Data Analytics in Action: The Basic Advantages

    A common misconception about data analytics is that huge volumes of data are required for every type of analytics. The truth is that even simple spreadsheet data, combined with descriptive analytics, can lead to valuable insights. Data analytics offers these obvious gains to businesses:

    • 360-degree view of and better understanding of customers
    • Enhanced customer service
    • Improved business performance (revenue, sales, customer base)
    • Timely, actionable insights
    • Competitive market intelligence
    • Improved products and services
    • Optimized business operations

    Summary on Types of Analytics

    According to this Geeks for Geeks article, “It is critical to (build a data analytic infrastructure) that provides a flexible, multi-faceted analytical ecosystem, optimized for efficient ingestion and analysis of large and diverse data sets.”

    In addition to all the benefits discussed above, data analytics is currently the most critical driver of a data-first business ecosystem. Data analytics is widely used across sectors, market segments, and various business types and sizes. Data analytics is one core activity that enables a business to make better decisions, drive performance, optimize resources, and understand customers. 

    Author: Paramita (Guha) Ghosh

    Source: Dataversity

  • Why BI reporting is superior to traditional reporting

    Why BI reporting is superior to traditional reporting

    The pandemic has caused a major change in the way we do business. Some organizations had already begun the digitization journey before the pandemic hit, giving them a head start; for others, rapid digitization helped the business survive and enabled people to work remotely. Requirements included using cloud technology to store and analyze large volumes of data that can be accessed by employees, partners, and other stakeholders. People cannot afford to wait for weekly or monthly reports to make critical company decisions, and they often need to see this information from home. The ability to generate accurate, relevant, and timely reports is critical if a company is to remain agile. In this post, we will discuss a few ways BI reporting is superior to traditional reporting practices.

    The ability to turn raw company data into actionable intelligence is at the core of today’s successful businesses.

    Data is increasingly more important to everyone’s role. Its value is in helping people do their jobs better and BI reporting provides a complete picture of how your business is performing.

    BI reporting offers one source of the truth

    Companies often have data stored in multiple sources such as ERP, CRM, and third-party systems. Traditionally, data must be combined manually into a single source, typically a spreadsheet. While spreadsheets have their uses, they are notoriously error-prone and not a good option for reporting. A mistake in a single cell can invalidate the entire report. Additionally, multiple managers will often share a spreadsheet. However, when multiple versions of the same document are created, it’s nearly impossible to guarantee that everyone is using the most current version.

    On the other hand, BI reporting integrates company data from multiple sources, so users always have access to one source of the truth. By consolidating disparate data into one discrete repository, data cannot be accidentally deleted or altered. Also, data is displayed on a BI dashboard in real-time so everyone works from the most current information.  

    BI reporting is on demand 

    As many executives know, traditional reporting is slow, rigid, and becomes outdated quickly.  Long, and often frustrating, wait times for IT generated reports are all too familiar experiences. Yet, executives and managers must rely on weekly, monthly, and annual reports to make critical business decisions. This can lead to missed opportunities.

    In contrast, BI reporting enables everyone to access data, conduct analysis, and create personalized reports without IT involvement. Self-service eliminates the wait time for IT reports. Instead, users can slice and dice the most current data whenever they need real-time, actionable insights. Also, standard reports can be generated on a designated schedule. For instance, reports can be set to generate on Monday mornings in anticipation of weekly staff meetings. If more information is needed during the meeting, a customized report can be created on the spot with just a few clicks.  

    Finally, free of the continuous demand for reports, the IT department has more time to focus their attention on other important tasks such as maintaining security or managing data resources. And, the IT department can apply BI reporting to develop strategies to grow the business and increase profitability. 

    BI reporting gives granular insight

    Traditional reports are static, only providing a summary of information without much detail. This means you cannot investigate which underlying factors are driving what you are observing.  Furthermore, static reports only provide the information you request. Since you can’t probe information you don't know is there, you are only getting half the story. A partial picture can lead to a wealth of missed opportunities.

    Conversely, BI reporting is dynamic, allowing users to select a metric and drill down into the underlying data. In this way, users are empowered to ask questions of the data and follow their train of thought to discover the answers. For instance, overall sales figures may be on target. However, drilling down will display sales figures by region, product line, and type. This detailed analysis might reveal that one product is over-performing, and that this is masking the declining sales of another product. With this level of granular insight, the sales team can work to boost the sales of the underperforming product to increase sales revenue overall.
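    Under the hood, that region/product drill-down amounts to re-aggregating the same transaction-level data by a finer dimension. A minimal sketch with invented sales records:

```python
from collections import defaultdict

# Hypothetical transaction-level records behind a headline sales figure.
sales = [
    {"region": "North", "product": "A", "amount": 500},
    {"region": "North", "product": "B", "amount": 200},
    {"region": "South", "product": "A", "amount": 650},
    {"region": "South", "product": "B", "amount": 90},
]

def drill_down(rows, dimension):
    """Break the overall total down by one dimension (e.g. region or product)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["amount"]
    return dict(totals)

overall = sum(r["amount"] for r in sales)   # the on-target headline number
by_product = drill_down(sales, "product")   # may reveal one product masking another
```

    A BI tool does exactly this interactively: the headline metric is one aggregation, and each drill-down is the same data grouped by a finer dimension.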

    BI reporting offers data visualizations

    BI reporting presents data in the form of visualizations to help clarify complex information. A graphical depiction of numbers makes the information easier to digest, retain, and recall. Visualizations might be simple bar charts, pie charts, and maps, or they might be more complex models such as waterfalls, funnels, and gauges, depending on your needs. In either case, your team will be able to see all the factors affecting performance. Visualization makes it easier to identify patterns, trends, and new opportunities. It also offers the ability to see changes in customer behavior so your team can respond in ways that drive sales and stay ahead of the competition.

    BI reporting for month-end statements

    A finance team using a BI tool to report on the month-end close can review and analyze financial statements directly. A BI tool with financial statement capability makes month-end financial statements more accessible and allows more people to understand the impact of operational decisions on financial performance faster.

    Adding financial statements to business intelligence software brings active analysis, data drill-down, and dashboarding to the finance and management teams, with fully controlled user permissions.

    Financial statements are created in the familiar format the accounting team recognizes, but the process is automated for each statement. The finance team can quickly build financial statements customized to users’ access, so branch managers can see information relevant to their branch, and management can see information across the whole business. The statements can also sit across one or many ERPs, so leaders can view individual company, branch, and regional performance, and even consolidated performance when required.

    Now that preparing the financial statements is faster and simpler, the finance team has time to carry out in-depth analysis of the numbers. By preparing financial statements within a data analytics environment, you can quickly compare statements across traditional periods or outside those timeframes, say from one week to the next.
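    That kind of period-over-period comparison is straightforward once statements live in an analytics environment. A minimal sketch, using invented weekly revenue figures:

```python
# Hypothetical weekly revenue totals pulled from consolidated statements.
weekly_revenue = {"2023-W01": 42000, "2023-W02": 45500, "2023-W03": 44100}

def period_over_period(series):
    """Percentage change between each consecutive period, rounded to 0.1%."""
    keys = sorted(series)
    return {later: round((series[later] - series[earlier]) / series[earlier] * 100, 1)
            for earlier, later in zip(keys, keys[1:])}

changes = period_over_period(weekly_revenue)  # week-to-week growth/decline
```

    The same function works for any granularity, weeks, months, or quarters, because the comparison is just an aggregation over whatever period keys the data carries.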

    Transitioning from traditional reporting to BI reporting will provide the ability to see the whole truth, make better decisions faster and uncover new business opportunities.

    Source: Phocas Software

  • Why flexibility is key to keep up with developments in BI software

    Why flexibility is key to keep up with developments in BI software

    In modern business, the proliferation of enterprise digital applications—analytics included—is accelerating their ongoing journeys toward cloud infrastructure adoption. The change is driven by companies’ growing need for a greater number and variety of users to access those applications. But repeatedly introducing new applications to meet business and user needs may not be the best bet.

    On a given day, a person will use more than 20 software applications—be they cloud-based, enterprise, or desktop—O’Reilly reports. Although analytics is increasingly critical to decision-making at all levels of the organization, business leaders risk further dividing users’ attention with new analytics applications.

    Still, employees expect all the capabilities they need at their fingertips. If senior leaders ask them to make a decision as part of their responsibilities, those same leaders must provide a streamlined way for those employees to access the insights they need to make that decision a good one.

    Analytics designed with this flexibility in mind makes this possible. These applications integrate analytics access into existing employee applications and workflows. As Forrester describes, “In the future, BI [i.e., analytics] will enable business users to turn insights into actions without having to leave whatever business or productivity application they have open.” Adaptive analytics environments of this kind allow classic analytics experiences to blend seamlessly with relevant tasks and processes, enhancing corresponding decision making for their users.

    The Problems With Existing Data Ecosystems

    Although most workers still lack access to powerful analytics, many more struggle to leverage analytics within their existing, business-critical workflows. Instead, organizations’ existing data ecosystems are separated within different environments, where individual teams each have their own silos to which they have grown accustomed. This makes it difficult to standardize data access, let alone have individual team members prioritize and use approved analytics resources.

    In lieu of data literacy training and self-service analytics capabilities, workers have had to rely on IT or data wranglers to retrieve the right data for them. Requesting data in this way is often a tedious, drawn-out process that many workers simply avoid. “If it takes four months to get data to support a decision, then the opportunity is lost,” Forbes describes in their article on 2021 analytics trends. “For the business to drive critical outcomes and opportunities quickly, data needs to be available quickly.”

    Make Analytics—Not Users—Adapt

    The future of successful analytics is an adaptive environment that can adjust to a constantly evolving and improving business decision lifecycle. Adaptive analytics of this kind is platform-agnostic and scalable; it can be deployed in any scenario and across on-premises, cloud, or hybrid environments. 

    Since adaptive analytics is cloud-based and flexible, integrating and evolving with a wide variety of digital tools, it future-proofs organizations against missing these opportunities. And while many companies cannot give up on on-premises data sources, advanced analytics of this kind allows them to harness the power of their legacy data stores and provides a bridge to the cloud.

    Most importantly, employees won’t have to change their best practices and existing workflows to leverage the latest, greatest analytics capabilities as they arrive. With embedded analytics, users can “‘make analytics calls’ [to databases] on-demand on a massively big scale and uncover previously hidden patterns and correlations,” as Forbes explains in another article. This solves the core problem many business leaders overlook: Workers don’t want new digital tools, per se; they want answers to their questions and support for their everyday responsibilities. 

    Hyperconvergence Is Key

    Simply put, business leaders must remove the barrier between business users and analytics accessibility—not by swinging open the doors to analytics tools but by strategically integrating analytics into business users’ existing workflows through hyperconvergence:

    “Hyperconverged data analytics is still big data analytics, but it is highly scalable, increasingly intelligent data analytics that has been unified with other core data tools and data functions, while it is also dovetailed with other business tools and business functions.”

    Forbes, “How Data Analytics Became Hyperconverged,” May 27, 2021

    Instead of forcing business users to engage in the tedious process of requesting data from data scientists, those data scientists can connect analytics resources to other business-critical application libraries, APIs, or workflows. In this way, data teams can “operationalize” analytics for business users without forcing them to learn entirely new applications. Analytics finds its place at each company’s operational center, within the same applications workers use every day. 

    As advanced analytics tools evolve, data scientists can enhance capabilities within those business-critical applications as well. For example, natural language processing (NLP) within those applications allows users to seamlessly retrieve insights from data—without technical analytics knowledge and without leaving their preferred application environment. Analytics may use AI to anticipate user behaviors within those application environments as well and then make recommendations based on any variety of available data.

    Start Turning Analytics Into People Power

    According to Forbes’ aforementioned article on 2021 analytics trends, “[2021] will be the year we see the influence of the business user have a major impact on data and analytics, driven by the need to adapt quickly to the next round of changes caused by the pandemic and the economy.” Indeed, flexibility will be critical to the longevity of analytics investments, the ubiquity of adoption, the success of business decision-making, and the realization of ROI.

    Flexibility in analytics also means the technology adapts to—and drives value within—the cultural norms and decision workflows within the organization, helping employees to improve rather than dramatically change how they work, collaborate, and improve. It’s the organizations that prioritize the data needs of their business users who will be most successful in the years to come.

    Author: Omri Kohl

    Source: Pyramid Analytics

  • Why it is key for leaders to infuse data and analytics into organizations

    Why it is key for leaders to infuse data and analytics into organizations

    Data and analytics are proving more important to companies’ strategies than ever before, according to a survey by Harvard Business Review Analytic Services. However, many organizations still fall short of achieving their analytics goals, owing to a skills gap and issues with data access and usage. That’s because almost 75% of organizations don’t have a leadership that completely backs a data-driven culture or nurtures and supports analytics-led innovation. Alarmingly, 24% of respondents say their company culture tends to limit access to information, and 20% think that organizational structure impedes use of analyzed data.

    That figure just hints at part of the problem: 19% of those surveyed blame a lack of leadership prioritization, and 11% say that the failure of previous data projects has led to disillusion and disinvestment in data and analytics. The end result of these combined issues is that 74% of companies experience leadership or cultural reluctance to use data and analytics to their fullest.

    Senior executives are failing to lead by example and embrace data and analytics. In turn, their teams have failed to adopt data-led approaches to their work, neglecting intelligence that could provide beneficial strategic insights and missing opportunities to drive growth, increase revenue, and evolve their businesses.

    We asked some of our experts what leaders should do to put this right. Read on to find out why execs of all kinds must be data evangelists.

    Execs driving evolution: Infusing data across the entire organization

    To maximize the benefits of data and analytics for any organization, our experts agree that business leaders must foster a cultural shift in their companies. Thinking differently and encouraging new habits throughout a business starts at the top.

    Achieving this requires them to develop a vision around being an intelligence-led company. Execs should support the education of colleagues and advocate an organizational culture that adopts the use of analytics as more than just a best practice. They need to adopt technological solutions that infuse analytics into every process, product, and service.

    “First and foremost, it’s an issue of C-suite leadership,” observes Guy Levy-Yurista, former Sisense Chief Strategy Officer. He explains: “Typically, they don’t concern themselves with data and analytics. It’s something they prefer to outsource to data specialists. This has to change if they want their businesses to survive. For an organization to become data-driven, the culture needs to change, and that change must be led by those at the top. The C-suite must embrace, and be seen to embrace, data and analytics. When the top leads, the rest will follow.”

    Envisioning a data-focused company

    Guy calls for companies to build a two- or three-year cohesive strategy that inculcates the use of data and analytics throughout the organization. He says, “Every company must have an embedded data strategy that takes into account the working practices and all of the data needs of every division.”

    This involves taking a fresh approach to data in order to get better results. 

    “Data-driven culture doesn’t mean ‘bring charts to meetings’ or ‘make decisions with numbers,’” explains Scott Castle, Sisense’s VP and General Manager, Internal Analytics. “It means implementing a hypothesis-driven culture. Identify theories, test them, rigorously seek to disprove them while rapidly implementing those that show promise. Make decisions with evidence. Don’t let your team search out favorable statistics. Encourage them to look at the complete data picture and come to conclusions based on the preponderance of the evidence.”

    To this end, Charles Holive, Managing Director of Sisense’s Strategy Consulting Business, calls for the appointment of a chief data officer in every leadership team to be the main advocate for data-driven working practices, and he says they should have revenue targets. He concludes: “This is not an initiative. It’s a way forward, a mandatory muscle for all companies to develop by infusing analytics in everything they do internally and externally, to overall increase returns on investments for their companies and their customers.”

    Analytics success stories: Companies infusing analytics to win their industries

    Smarte Carte has done an excellent job of bringing all of its data together and putting it into the hands of its field team, so everyone is working with near real-time data from their mobile devices. This helps ensure better forecasting, reduces product/kiosk downtime, and ensures that its people have the answers they need when and where they need them.

    Another huge example of a company leading with data is Amazon: The Seattle tech giant is an extremely data-driven company, with customer service at its core. Amazon measures the effectiveness of almost everything it does, including innovation. Guy observes that Amazon gives employees license to innovate almost at will. (Indeed, it’s an unofficial motto around the “Everything Store” that employees should innovate their own jobs away.)

    It’s becoming increasingly important to try to measure innovation with data. Providing it can, and providing the numbers show a benefit, the innovation becomes regular operating procedure. With this in mind, Guy recommends embracing innovation as a critical driver of success. 

    “Innovation can be inexact and inefficient, often by design,” says Guy. “So, any company needs to create a team and allocate a budget that’s dedicated to innovation and that has the latitude to examine data further, stretch parameters, and explore whether there are new possibilities out there.”

    Pressure drives evolution: Companies transforming under COVID-19

    The coronavirus pandemic forced companies to think differently and improve their agility. Innovation became critical for nearly every business. Scott describes it as “a perfect example of a sudden market change that required every business to reconsider its fundamental assumptions.”

    Some companies, like those in air travel and hospitality, saw demand for their products almost completely disappear overnight, and others, like Zoom and grocery stores, saw it scale unexpectedly. They all required quick responses to adjust to, capitalize upon, or even simply survive in the new reality. Recall the supply chain problems that paralyzed supermarkets in April 2020: Organizations faced with totally new market dynamics needed to test new hypotheses and run experiments quickly — and those that did, using data and analytics, survived and thrived. 

    The pandemic brought a new focus to data and its timeliness because conditions could change daily; a dashboard being updated once a week or month was no longer acceptable. The customer experience changed as well, almost overnight, and will only continue to evolve.

    Taking a macro view, Charles observes that, “Many markets got to be reset through the pandemic … giving an opportunity to large, small, and new players to reinvent themselves and tackle the market from a redefined environment. It’s been surprising to me to see how fast companies, doing it through the value and data-driven approach, went on to win more.”

    Lead with change, or change leaders

    The consensus is clear — organizations can’t stand still. To flourish, they must be led using actionable intelligence, derived from data and analytics. They must infuse analytics into their practices, their products, their services, and even alter their organization’s DNA if necessary, to become modern businesses.

    To do that, they need their leaders to become evangelists for analytics. They must expedite the infusion of analytics everywhere and enable everyone to use actionable intelligence. The choice is stark for the leaders of every business: Do this for your organization to survive and thrive, or die.

    Author: Adam Murray

    Source: Sisense

  • Why the right data input is key: A Machine Learning example

    Why the right data input is key: A Machine Learning example

    Finding the ‘sweet spot’ of data needs and consumption is critical to a business. Without enough, the business model underperforms. Too much, and you run the risk of compromised security and protection. Measuring what data intake is needed, like a balanced diet, is key to optimum performance and output. A healthy diet of data will set a company on the road to maximum results without drifting into red areas on either side.

    Machine learning is not black magic. A simple definition is the application of learning algorithms to data to uncover useful aspects of the input. There are clearly two parts to this process, though: the algorithms themselves and the data being processed and fed in.

    The algorithms are vital, and continually tuning and improving them makes a significant difference to the success of the solutions. However, these are just mathematical operations on the data. The pivotal part is the data itself. Quite simply, the algorithms cannot work well with too little data; a deficit leaves the system undernourished and hungering for more. With more data to consume, the system can be trained more fully and the outcomes are stronger.

    Without question, there is a big need for an ample amount of data to offer the system a healthy helping to configure the best outcomes. What is crucial, though, is that the data collected is representative of the tasks you intend to perform.

    Within speech recognition, for example, this means that you might be interested in any or all of the following attributes of the speech itself:

    • formal speech/informal speech
    • prepared speech/unprepared speech
    • trained speakers/untrained speakers
    • presenter/conversational
    • general speech/specific speech
    • accents/dialects

    as well as attributes of the recording conditions:

    • noisy/quiet
    • professional recording/amateur recording
    • broadcast/telephony
    • controlled/uncontrolled
    In reality, all of these attributes impact the ability to perform the tasks required of speech recognition with ultimate accuracy. Therefore, the data needed to tick all the boxes is different and involves varying degrees of difficulty to obtain. Bear in mind that it is not just the audio that is needed, accurate transcripts are required to perform training. That probably means that most data will need to be listened to by humans to transcribe or validate the data, and that can create an issue of security.

    An automatic speech recognition (ASR) system operates in two modes: training and operating.

    Training is most likely managed by the AI/ML company providing the service, which means the company needs access to large amounts of relevant data. In some cases, this is readily available in the public domain anyway. For example, content that has already been broadcast on television or radio and therefore has no associated privacy issues. But this sort of content cannot help with many of the other scenarios in which ASR technology can be used, such as phone call transcription, which has very different characteristics. Obtaining this sort of data can be tied up with contracts for data ownership, privacy and usage restrictions.

    In operational use, there is no need to collect audio. You just use the models that have previously been trained. But the obvious temptation is to capture the operational data and use it. However, as mentioned, this is where the challenge begins: ownership of the data. Many cloud solution providers want to use the data openly, as it will enable continuous improvement for the required use cases. Data ownership becomes the lynchpin.

    The challenge is to be able to build great models that work really well in any scenario without capturing privately-owned data. A balance between quality and security must be struck. This trade-off happens in many computer systems but somehow data involving people’s voices often, understandably, generates a great deal of concern.

    Finding a solution

    To ultimately satiate an ASR system, there needs to be just enough data provided to execute the training so good systems can be built. There is an option for companies to train their own models, which enables them to maintain ownership of the data. This can often require a complex professional services agreement, requiring a good investment of time, but it can provide a solution at a reasonable cost very quickly.

    ML algorithms are in a constant state of evolution, and techniques can now be used that allow smaller data sets to be used to bias systems already trained on big data. In some cases, smaller amounts of data can achieve ‘good enough’ accuracy. The overall issue of data acquisition is not removed, but sometimes less data can provide solutions.

    Finding a balanced data diet by enabling better algorithm tuning, and filtering and selection of data, can get the best results without collecting everything that has ever been said. More effort may be needed to achieve the best equilibrium. And, without doubt, the industry must maintain its search for ways to make the technology work better without people’s privacy being compromised.

    Author: Ian Firth

    Source: Insidebigdata

  • Why you need a data fabric next to an IT architecture to optimize BI  

    Why you need a Data Fabric next to an IT Architecture to optimize BI

    Data fabrics offer an opportunity to track, monitor and utilize data, while IT architectures track, monitor and maintain IT assets. Both are needed for a long-term digitalization strategy.

    As companies move into hybrid computing, they’re redefining their IT architectures. IT architecture describes a company's entire IT asset base, whether on-premises or in-cloud. This architecture is stratified into three basic levels: hardware such as mainframes, servers, etc.; middleware, which encompasses operating systems, transaction processing engines, and other system software utilities; and the user-facing applications and services that this underlying infrastructure supports.

    IT architecture has been a recent IT focus because as organizations move to the cloud, IT assets also move, and there is a need to track and monitor these shifts.

    However, with the growth of digitalization and analytics, there is also a need to track, monitor, and maximize the use of data that can come from a myriad of sources. An IT architecture can’t provide data management, but a data fabric can. Unfortunately, most organizations lack well-defined data fabrics, and many are still trying to understand why they need a data fabric at all.

    What Is a Data Fabric?

    Gartner defines a data fabric as “a design concept that serves as an integrated layer (fabric) of data and connecting processes. A data fabric utilizes continuous analytics over existing, discoverable and inferenced metadata assets to support the design, deployment and utilization of integrated and reusable data across all environments, including hybrid and multi-cloud platforms.”

    Let’s break it down.

    Every organization wants to use data analytics for business advantage. To use analytics well, you need data agility that lets you easily connect and combine data from any source your company uses, whether that source is an enterprise legacy database, social media, or the Internet of Things (IoT). You can’t achieve data integration and connectivity without data integration tools, and you must also find a way to connect and relate disparate data in meaningful ways if your analytics are going to work.
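
    As a minimal illustration of relating disparate sources, the sketch below joins a legacy relational table with an IoT-style feed on a shared key. The table, feed, and key names are hypothetical:

```python
import sqlite3

# Legacy enterprise source: a relational table (here, in-memory SQLite).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE devices (device_id TEXT PRIMARY KEY, site TEXT)")
db.executemany("INSERT INTO devices VALUES (?, ?)",
               [("d1", "plant-north"), ("d2", "plant-south")])

# Disparate source: an IoT feed, arriving as loose JSON-like records.
iot_feed = [{"device_id": "d1", "temp_c": 21.4},
            {"device_id": "d2", "temp_c": 35.9}]

# Integration step: relate the two sources on a shared key so analytics
# can query them as one combined view.
sites = dict(db.execute("SELECT device_id, site FROM devices"))
combined = [{**r, "site": sites.get(r["device_id"])} for r in iot_feed]
```

    A data fabric does this kind of relating at scale and continuously, across many more sources and data types than a hand-written join.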

    This is where the data fabric enters. The data fabric contains all the connections and relationships between an organization’s data, no matter what type of data it is or where it comes from. The goal of the fabric is to function as an overall tapestry that interweaves all data so that data in its entirety is searchable. This has the potential not only to optimize data value, but to create a data environment that can answer virtually any analytics query. The data fabric does what an IT architecture can’t: it tells you what data does, and how data items relate to each other. Without a data fabric, a company’s ability to leverage data and analytics is limited.

    Building a Data Fabric

    When you build a data fabric, it’s best to start small and in a place where your staff already has familiarity.

    That “place” for most companies will be with the tools that they are already using to extract, transform and load (ETL) data from one source to another, along with any other data integration software such as standard and custom APIs. All of these are examples of data integration you have already achieved.

    Now, you want to add more data to your core. You can do this by continuing to use the ETL and other data integration methods you already have in place as you build out your data fabric. In the process, take care to also add metadata about your data: the origin point of the data, how it was created, which business and operational processes use it, what its form is (e.g., a single field in a fixed record, or an entire image file), and so on. By maintaining the data’s history, as well as all its transformations, you are in a better position to check data for reliability and to ensure that it is secure.
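
    One way to carry such metadata alongside each record during ETL is a lineage wrapper that records origin, form, and every transformation applied. The class and field names below are illustrative assumptions, not a standard:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical lineage wrapper: names and fields are illustrative only.
@dataclass
class TrackedRecord:
    payload: dict
    origin: str                  # where the data was first created
    form: str                    # e.g. "fixed-record field", "image file"
    history: list = field(default_factory=list)  # transformations applied

    def transform(self, name, fn):
        # Apply an ETL step and log it with a UTC timestamp.
        self.payload = fn(self.payload)
        self.history.append((name, datetime.now(timezone.utc).isoformat()))
        return self

rec = TrackedRecord({"amount": "12.50"},
                    origin="orders_db.orders", form="fixed-record field")
rec.transform("cast_amount_to_float",
              lambda p: {**p, "amount": float(p["amount"])})
```

    With the history preserved, the fabric can answer where a value came from and how it changed, which is exactly what reliability and security checks need.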

    As your data fabric grows, you will probably add data tools that are missing from your workbench. These might be tools that help with tracking data, sharing metadata, applying governance to data, etc. A recommendation here is to look for all-inclusive data management software that contains not only all the tools you’ll need to build a data fabric, but also important automation such as built-in machine learning.

    The machine learning observes how data in your data fabric works together, and which combinations of data are used most often in different business and operational contexts. When you query the data, the ML assists in pulling together the data that is most likely to answer your queries.

    It’s difficult for many organizations to develop data fabric elements like machine learning “from scratch.” This is where data management software helps because it usually includes already automated, built-in machine learning that you can use in your data fabric.



    The data fabric development can start on a small scale, such as a specific business area or a use case. In most cases, IT can use data integration tools it is already familiar with, together with a data management system that can automate many of the data fabric building functions that IT is less familiar with.

    The end goal should be an IT architecture that tells you where every IT asset is and what it does; and a data fabric that tells you everything you want to know about the data in that infrastructure. 

    Author: Mary E. Shacklett

    Source: InformationWeek
