14 items tagged "Data mining"

  • A Shortcut Guide to Machine Learning and AI in The Enterprise


    Predictive analytics / machine learning / artificial intelligence is a hot topic – what’s it about?

    Using algorithms to help make better decisions has been the “next big thing in analytics” for over 25 years. It has been used in key areas such as fraud detection the entire time. But it has now become a full-throated mainstream business meme that features in every enterprise software keynote — although the industry is still battling over what to call it.

    It appears that terms like Data Mining, Predictive Analytics, and Advanced Analytics are considered too geeky or old for industry marketers and headline writers. The term Cognitive Computing seemed to be poised to win, but IBM’s strong association with the term may have backfired — journalists and analysts want to use language that is independent of any particular company. Currently, the growing consensus seems to be to use Machine Learning when talking about the technology and Artificial Intelligence when talking about the business uses.

    Whatever we call it, it’s generally proposed in two different forms: either as an extension to existing platforms for data analysts; or as new embedded functionality in diverse business applications such as sales lead scoring, marketing optimization, sorting HR resumes, or financial invoice matching.

    Why is it taking off now, and what’s changing?

    Artificial intelligence is now taking off because there’s a lot more data available, along with affordable, powerful systems to crunch through it all. It’s also much easier to get access to powerful algorithm-based software, whether as open-source products or embedded as a service in enterprise platforms.

    Organizations today are also more comfortable with manipulating business data, with a new generation of business analysts aspiring to become “citizen data scientists.” Enterprises can take their traditional analytics to the next level using these new tools.

    However, we’re now at the “Peak of Inflated Expectations” for these technologies according to Gartner’s Hype Cycle — we will soon see articles pushing back on the more exaggerated claims. Over the next few years, we will find out the limitations of these technologies even as they start bringing real-world benefits.

    What are the longer-term implications?

    First, easier-to-use predictive analytics engines are blurring the line between “everyday analytics” and the data science team. A “factory” approach to creating, deploying, and maintaining predictive models means data scientists can have greater impact. And sophisticated business users can now access some of the power of these algorithms without having to become data scientists themselves.

    Second, every business application will include some predictive functionality, automating any areas where there are “repeatable decisions.” It is hard to think of a business process that could not be improved in this way, with big implications in terms of both efficiency and white-collar employment.
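A minimal sketch of the kind of “repeatable decision” such embedded functionality automates: scoring a sales lead with a simple logistic model. The feature names, weights, and bias below are invented for the illustration, not taken from any product.

```python
from math import exp

# Hypothetical lead-scoring model: each boolean feature contributes a
# weight to a logistic score. All names and numbers are illustrative.
WEIGHTS = {"visited_pricing_page": 1.8, "opened_last_email": 0.9,
           "company_size_over_100": 1.2}
BIAS = -2.0

def score_lead(lead: dict) -> float:
    """Return the estimated probability (0..1) that a lead converts."""
    z = BIAS + sum(w for feature, w in WEIGHTS.items() if lead.get(feature))
    return 1.0 / (1.0 + exp(-z))

hot = score_lead({"visited_pricing_page": True, "opened_last_email": True,
                  "company_size_over_100": True})
cold = score_lead({})
```

In a real application the weights would be fitted from historical conversion data rather than hand-set, but the decision being automated has exactly this shape.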

    Third, applications will use these algorithms on themselves to create “self-improving” platforms that get easier to use and more powerful over time (akin to how each new semi-autonomous-driving Tesla car can learn something new and pass it on to the rest of the fleet).

    Fourth, over time, business processes, applications, and workflows may have to be rethought. If algorithms are available as a core part of business platforms, we can provide people with new paths through typical business questions such as “What’s happening now? What do I need to know? What do you recommend? What should I always do? What can I expect to happen? What can I avoid? What do I need to do right now?”

    Fifth, implementing all the above will involve deep and worrying moral questions in terms of data privacy and allowing algorithms to make decisions that affect people and society. There will undoubtedly be many scandals and missteps before the right rules and practices are in place.

    What first steps should companies be taking in this area?
    As usual, the barriers to business benefit are more likely to be cultural than technical.

    Above all, organizations need to make sure they have the right technical expertise to be able to navigate the confusion of new vendor offerings, the right business knowledge to know where best to apply them, and the awareness that their technology choices may have unforeseen moral implications.

    Source: timoelliot.com, October 24, 2016


  • Becoming a BI Analyst: What Does It Entail?


    As with many buzzwords emerging from the intersection of business and technology, the phrase business intelligence (BI) is often misunderstood. In a nutshell, it refers to the skill and practice of extracting insights from data to identify new goals, strategies, trends, and sources of value. A business intelligence analyst, working with a network of other knowledge workers (such as data stewards and data governance specialists), helps an enterprise thrive.

    Business Intelligence Explained

    Business intelligence refers to the perspectives gained from analyzing the business information that companies hold. Since that data may be spread across many locations and departments, business intelligence is an amalgam of analytics and mining that can empower management with the tools needed to make informed decisions that may not otherwise be apparent.  

    Today’s data-driven businesses are growing at an unprecedented pace, often along unpredictable paths. Because of this, you might think that business intelligence should largely be an automated affair – even the domain of AI. However, algorithms and automation alone cannot harness the creative connections and nuanced insights required within the field. Although IT is obviously a major part of the equation, business intelligence requires human intelligence.  

    Curious about what it takes to become a business intelligence analyst? Read on for the skills and education you’ll need and the responsibilities you’ll have if you follow this career path.

    What Is a Business Intelligence Analyst?

    As is common among data-centric professions, a business intelligence analyst (BIA) must wear many hats and have skills that fall across various areas. Still, the core of the job boils down to creating regular reports that summarize a company’s current data holdings in relation to parallel financial reports and current market intelligence. 

    Typically, these reports cogently present salient trends in an identified market that could impact the goals and actionable items on a company’s agenda, plotted as a function of the various data assets at the organization’s disposal.  

    Although a business intelligence analyst is much more than a glorified office assistant, the job is best understood as a support role for executive decision-makers. A BIA must provide meticulously supported analytical insights that reflect the current realities of both the enterprise and markets in question. At the end of the day, the key outcomes of the analyst’s work are to bolster the company’s place in the market, streamline the efficiency of the staff, amplify overall productivity, and even upgrade performance at the level of customer experience.   

    The business intelligence analyst is a relatively new vocation, but it is growing fast: Forbes recently tapped the BIA as one of the most sought-after positions in the greater STEM marketplace.

    Since there’s a demand for BI expertise across so many industries – healthcare and medicine, insurance, finance, e-commerce – professionals working in the U.S. can expect to command a salary of roughly $80,000 per year (with even higher figures in especially tech-heavy states).

    What Skills Do Business Intelligence Analysts Need?

    Just as one would expect from the job title, the lion’s share of a business intelligence analyst’s skill set involves crunching data. They need to have a strong command of data at every level, including organization, storage, mining big data, and analysis – all with a keen and responsive eye for spotting key performance indicators and business-critical priorities in a company’s data troves.  

    Beyond data, a top-tier BIA will have some proficiency in tools tailored specifically for BI, programming languages, and systems analysis.  

    Data and tech know-how may anchor the position, but it’s nothing without a raft of communications skills to translate data insights into actionable steps. This entails critical thinking and the ability to make presentations that speak to the needs of stakeholders in easy-to-understand language and data visualizations.

    Typical skills required for business intelligence analysts:

    • Extensive knowledge of software for user interfaces, database management, and enterprise resource management (proficiency in Python, R, C#, Hadoop, and SQL)
    • Presentation and reporting in a timely and cogent manner (mastery of PowerPoint and business functions of Zoom are obvious assets)
    • Upper-level background in integrating software and programs into multiple tiers of data services
    • A knack for problem-solving in both technical and interpersonal contexts; at least five years of engagement in analytical and critical thinking skills in a professional setting
    • Ability to build rapport with both individuals in management and interdepartmental teams (especially in cases of implementing new software and tech that may result from BI recommendations)
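Much of the SQL work listed above boils down to turning raw records into a KPI a stakeholder can act on. A minimal sketch, using Python’s built-in sqlite3 module; the table layout and figures are invented for the illustration.

```python
import sqlite3

# Hypothetical order data loaded into an in-memory database, then a
# typical BI aggregation: revenue per region, highest first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 250.0)])

kpi = conn.execute(
    "SELECT region, SUM(amount) AS revenue "
    "FROM orders GROUP BY region ORDER BY revenue DESC"
).fetchall()
```

The same query pattern scales from a toy table to a production warehouse; only the connection string and schema change.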

    BI Roles and Responsibilities 

    As much as business intelligence can be about interpersonal action, much of an analyst’s duties are solitary, chief among them authoring procedures for data collection and processing. From there, expect reporting and more reporting, including analytical reports personalized for the needs of stakeholders, highlighting the most departmentally relevant findings.

    A business intelligence analyst also needs to maintain an active role in the various life cycles of data as it moves through the organization. After all, data reports are built on regular monitoring of the way data is collected, drawing on field reports, product summaries from third parties, and even public records.

    As a function of this, a BIA may want to continually track burgeoning trends in tech or emerging markets that could potentially offer efficiency or value within the industry and their specific enterprise.  

    Working in concert with specialists in data governance and stewardship, a BIA must oversee the integrity, security, and location of data storage. This should be performed in the organization’s computer database and may be done in conjunction with new operational protocols that make the most of the database as it evolves in tandem with updates and unique program features. Finally, BIAs benefit from taking a step back for meta-analysis, forging new methodologies that improve analysis at every step outlined above.

    Required Education and Training 

    There are several routes you may follow to prepare for a career in business intelligence. Most obviously, you can earn a bachelor’s degree directly in business intelligence, which incorporates a study of analytics with elements of marketing, tech, and management. Alternatively, a beginner in the field may want to proceed more obliquely, garnering a B.A. in a related field, such as computer science, accounting, finance, management, or business. A bachelor’s is enough to open the door for most entry-level positions in business intelligence, but a master’s in a more comprehensive discipline such as business analytics can make the difference in landing more competitive, elite jobs.  

    Date: September 20, 2023

    Author: Shauna Frenté

    Source: Dataversity 

  • Big Data will improve our healthcare

    He is a man with a mission, and no small one: together with patients, care providers, and insurers he wants to bring about a transformation in healthcare, shifting the focus from managing illness to managing health. Jeroen Tas, CEO Philips Connected Care & Health Informatics, on the future of healthcare.

    What is wrong with the current system?

    “In the developed world, an average of 80 percent of the healthcare budget is spent on treating chronic diseases such as cardiovascular disease, lung disease, diabetes, and various forms of cancer. Only 3 percent of that budget is spent on prevention, on keeping those diseases from occurring. Yet we know that 80 percent of cardiovascular disease, 90 percent of type 2 diabetes, and 50 percent of cancers are preventable. Socioeconomic factors play a role here, but so do diet, smoking and drinking, how much daily exercise you get, and whether you take your medication properly. With the current system, we are far from always steering on the right drivers to improve people’s health and thereby their lives. Fifty percent of patients do not take their medication, or do not take it on time. That is where the opportunities for improvement lie.”

    That system has existed for years - why is it a problem now?
    “The reasons are, I think, well known. In many countries, including the Netherlands, the population is aging, so the number of chronically ill people is rising, and with it the pressure on healthcare. At the same time, citizens’ attitudes toward care are changing: better accessibility, integration, and 24/7 availability are the big wishes. Finally, the technological possibilities are expanding rapidly. People increasingly can and want to play an active role in their own health: self-measurement, personal information, and feedback on their progress. With Big Data we are now able, for the first time, to analyze large volumes of data quickly, to discover patterns in them, and to learn more about predicting and preventing disease. In short, we live in a time in which a great deal can and will change within a short period. That makes it important to steer the right course.”

    What do you think needs to change?
    “Healthcare is still organized around (acute) events. Health, however, is a continuous process that starts with healthy living and prevention. If people do fall ill, diagnosis and treatment follow. Then they recover, but they may still need support at home. And you hope they go on living healthily again. If their condition deteriorates, timely intervention is desirable. The focus of our current system lies almost entirely on diagnosis and treatment. The reimbursement system is geared toward that as well: a radiologist is not judged on his contribution to a patient’s treatment but on the number of images he produces and reads. Yet we know that there are enormous gains to be made in time, well-being, and money if we focus more on healthy living and prevention.

    There must also be far more connections between the different pillars of the system, and feedback on the effectiveness of diagnosis and treatment. That can be achieved, for example, by encouraging the sharing of information. If a cardiologist has more data about a patient’s home situation, for instance about how he takes his medication, eats, and exercises, he can draw up a much better treatment plan, tailored to the patient’s specific circumstances. If home care also has access to that patient’s data after treatment, they know what to pay extra attention to for optimal recovery. And last but certainly not least, the patient must have access to that data too, in order to stay as healthy as possible. The result is a patient-centered system aimed at optimal health.”

    That sounds perfectly logical. So why isn’t it happening yet?
    “All change is difficult, certainly change in a sector like healthcare, which is conservative for understandable reasons and full of complex processes. It is not a matter of technology: all the technology we need to bring about this shift already exists. We have sensors that generate data automatically, which can be installed in the patient’s environment, worn on the body - think of a smartwatch - and even placed inside the body, in the case of smart medicines. That puts the person at the center of the system, which is where we want to go.
    A care network must be built around each person, within which data is shared for the benefit of personal health. Thanks to technology, many treatments can also take place remotely, via eHealth solutions. That is usually faster and above all more efficient than routinely sending people to the hospital. Think of home monitoring, a portable ultrasound device at the general practitioner’s office, or video calls with a care provider. Incidentally, we can already measure heart rate, respiration, and SpO2 from a video image.

    The technology is there. We only need to combine it, integrate it and, above all, implement it. Implementation depends on the willingness of all parties involved to find the right reimbursement model and form of collaboration: government, health insurers, hospitals, doctors, care providers, and patients themselves. On that point I am optimistic: I see attitudes changing, slowly but surely. There is a growing willingness to change.”

    Is that willingness the only limiting factor?
    “We also have to settle a number of issues around data. Data must be able to flow without barriers, so that all of a patient’s information is available anytime, anywhere. That obviously also means ensuring that the information is well secured, and that we can keep guaranteeing that security. Finally, we have to build the trust needed to standardize and share data, among care providers and above all among patients. That sounds daunting and complicated, but we have done it before. If someone had told you twenty years ago that you would handle all your banking over the internet, you would have thought they were mad: far too insecure. Today we hardly do anything else.
    The shift in healthcare now, like the shift in finance then, demands a different mindset. The urgency is there, the technology is there, and the willingness is growing - that is why I am very positive about the future of healthcare.”

     Source: NRC
  • Business Data Scientist program now also in Belgium


    The Radboud Management Academy has now brought its Business Data Scientist program, so successful in the Netherlands, to the Belgian market as well. In cooperation with Business & Decision, a condensed version of the program was taught last week at the Axa office in Brussels to people from the Belgian business community. Representatives of ministries and other government institutions also attended.

    The program responds to companies’ need to extract more value from the data available to them. It focuses not only on developing individual competencies but also on the organizational structures and instruments that help organizations work in a more data-driven way.

    The 3D model at the heart of the program is seen by participants as an important addition to the technical competencies they often already possess. It is increasingly recognized that the skills that improve interfacing with “the business” are ultimately decisive for turning data-driven insights into value. The program significantly extends the data scientist’s toolbox with functional, social, and technical skills alike.

    Want to know more? Go to http://www.ru.nl/rma/leergangen/bds/

  • Business Intelligence in 3PL: Mining the Value of Data

    In today’s business world, “information” is a renewable resource and virtually a product in itself. Business intelligence technology enables businesses to capture historical, current and predictive views of their operations, incorporating such functions as reporting, real-time analytics, data and process mining, performance management, predictive analytics, and more. Thus, information in its various forms and locations possesses genuine inherent value.
    In the real world of warehousing, the availability of detailed, up-to-the-minute information on virtually every item in the operators’ custody, from inbound dock to delivery site, leads to greater efficiency in every area it touches. Logic suggests that greater profitability ensues.
    Three areas of 3PL operations seem to benefit most from the savings opportunities identified through business intelligence solutions: labor, inventory, and analytics.
    In the first case, business intelligence tools can help determine the best use of the workforce, monitoring its activity in order to assure maximum effective deployment. The result: potentially major jumps in efficiency, dramatic reductions in downtime, and healthy increases in productivity and billable labor.
    In terms of inventory management, the metrics obtainable through business intelligence can stem inventory inaccuracies that would otherwise result in thousands of dollars in annual losses, while also reducing write-offs.
    Analytics through business intelligence tools can also accelerate the availability of information, as well as provide the optimal means of presentation relative to the type of user. One such example is the tracking of real-time status of work load by room or warehouse areas; supervisors can leverage real-time data to re-assign resources to where they are needed in order to balance workloads and meet shipping times. A well-conceived business intelligence tool can locate and report on a single item within seconds and a couple of clicks.
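The workload-balancing decision described above can be sketched as a small allocation routine: given real-time counts of open pick tasks per zone, assign the available workers in proportion to the load. The zone names and numbers below are invented for the illustration.

```python
# Hypothetical proportional workload balancer for warehouse zones.
def allocate(tasks: dict, total_workers: int) -> dict:
    """Assign workers to zones in proportion to open task counts."""
    total = sum(tasks.values())
    alloc = {zone: max(1, round(total_workers * n / total))
             for zone, n in tasks.items()}
    # Absorb any rounding drift in the busiest zone so headcount matches.
    busiest = max(tasks, key=tasks.get)
    alloc[busiest] += total_workers - sum(alloc.values())
    return alloc

plan = allocate({"dock": 120, "cold": 20, "dry": 40}, total_workers=12)
```

A supervisor walking the floor would re-run such an allocation as task counts change, which is exactly the real-time re-assignment the dashboard supports.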
    Extending the Value
    The value of business intelligence tools is definitely not confined to the product storage areas.
    With automatically analyzed information available in a dashboard presentation, users – whether in the office or on the warehouse floor – can view the results of their queries/searches in a variety of selectable formats, choosing the presentation based on its usefulness for a given purpose. Examples:
    • Status checks can help identify operational choke points, such as if/when/where an order has been held up too long; if carrier wait-times are too long; and/or if certain employees have been inactive for too long.
    • Order fulfillment dashboards can monitor orders as they progress through the picking, staging and loading processes, while also identifying problem areas in case of stalled processes.
    • Supervisors walking the floor with handheld devices can both encourage team performance and, at the same time, help assure efficient dock-side activity. Office and operations management are able to monitor key metrics in real-time, as well as track budget projections against actual performance data.
    • Customer service personnel can call up business intelligence information to assure that service levels are being maintained or, if not, institute measures to restore them.
    • And beyond the warehouse walls, sales representatives in the field can access mined and interpreted data via mobile devices in order to provide their customers with detailed information on such matters as order fill rates, on-time shipments, sales and order volumes, inventory turnover, and more.
    Thus, well-designed business intelligence tools not only can assemble and process both structured and unstructured information from sources across the logistics enterprise, but can deliver it “intelligently” – that is, optimized for the person(s) consuming it. These might include frontline operators (warehouse and clerical personnel), front line management (supervisors and managers), and executives.
    The Power of Necessity
    Chris Brennan, Director of Innovation at Halls Warehouse Corp., South Plainfield N.J., deals with all of these issues as he helps manage the information environment for the company’s eight facilities. Moreover, as president of the HighJump 3PL User Group, he strives to foster collective industry efforts to cope with the trends and issues of the information age as it applies to warehousing and distribution.
    “Even as little as 25 years ago, business intelligence was a completely different art,” Brennan has noted. “The tools of the trade were essentially networks of relationships through which members kept each other apprised of trends and happenings. Still today, the power of mutual benefit drives information flow, but now the enormous volume of data available to provide intelligence and drive decision making forces the question: Where do I begin?”
    Brennan has taken a leading role in answering his own question, drawing on the experience and insights of peers as well as the support of HighJump’s Enterprise 3PL division to bring Big Data down to size:
    “Business intelligence isn’t just about gathering the data,” he noted, “it’s about getting a group of people with varying levels of background and comfort to understand the data and act upon it. Some managers can glance at a dashboard and glean everything they need to know, but others may recoil at a large amount of data. An ideal BI solution has to relay information to a diverse group of people and present challenges for them to think through.”
    source: logisticviewpoints.com, December 6, 2016
  • From Patterns to Predictions: Harnessing the Potential of Data Mining in Business  


    Data mining techniques can be applied across various business domains such as operations, finance, sales, marketing, and supply chain management, among others. When executed effectively, data mining provides a trove of valuable information, empowering you to gain a competitive advantage through enhanced strategic decision-making.

    At its core, data mining is a method employed for the analysis of data, delving into large datasets to unearth meaningful and data-driven insights. Key components of successful data mining encompass tasks like data cleaning, data transformation, and data integration.

    Data Cleaning and Preparation

    Data cleaning and preparation stand as crucial stages within the data mining process, playing a pivotal role in ensuring the effectiveness of subsequent analytical methods. The raw data necessitates purification and formatting to render it suitable for diverse analytic approaches. Encompassing elements such as data modeling, transformation, migration, ETL (Extract, Transform, Load), ELT (Extract, Load, Transform), data integration, and aggregation, this phase is indispensable for comprehending the fundamental features and attributes of data, ultimately determining its optimal utilization.

    The business implications of data cleaning and preparation are inherently clear. Without this initial step, data holds either no meaning for an organization or is compromised in terms of reliability due to quality issues. For companies, establishing trust in their data is paramount, ensuring confidence not only in the data itself but also in the analytical outcomes and subsequent actions derived from those results.
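A minimal pure-Python sketch of this cleaning-and-preparation step: normalizing inconsistent date formats, dropping incomplete records, and de-duplicating. The record layout is invented for the illustration.

```python
from datetime import datetime

# Hypothetical raw sales records with typical quality problems.
raw = [
    {"id": 1, "sale_date": "2023-09-20", "amount": "120.50"},
    {"id": 1, "sale_date": "2023-09-20", "amount": "120.50"},  # duplicate
    {"id": 2, "sale_date": "20/09/2023", "amount": "80"},      # odd format
    {"id": 3, "sale_date": "2023-09-21", "amount": None},      # incomplete
]

def clean(records):
    """Drop incomplete rows, de-duplicate by id, normalize dates/amounts."""
    seen, out = set(), []
    for r in records:
        if r["amount"] is None or r["id"] in seen:
            continue
        seen.add(r["id"])
        for fmt in ("%Y-%m-%d", "%d/%m/%Y"):  # the known date formats
            try:
                date = datetime.strptime(r["sale_date"], fmt).date()
                break
            except ValueError:
                pass
        out.append({"id": r["id"], "sale_date": date.isoformat(),
                    "amount": float(r["amount"])})
    return out

cleaned = clean(raw)
```

Real ETL pipelines do the same work at scale, but every step here (validation, normalization, de-duplication) maps onto the phase described above.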

    Pattern and Classification

    The essence of data mining lies in the fundamental technique of tracking patterns, a process integral to discerning and monitoring trends within data. This method enables the extraction of intelligent insights into potential business outcomes. For instance, upon identifying a sales trend, organizations gain a foundation for taking strategic actions to leverage this newfound insight. When it’s revealed that a specific product outperforms others within a particular demographic, this knowledge becomes a valuable asset. Organizations can then capitalize on this information by developing similar products or services tailored to the demographic or by optimizing the stocking strategy for the original product to cater to the identified consumer group.

    In the realm of data mining, classification techniques play a pivotal role by scrutinizing the diverse attributes linked to various types of data. By discerning the key characteristics inherent in these data types, organizations gain the ability to systematically categorize or classify related data. This process proves crucial in the identification of sensitive information, such as personally identifiable data, prompting organizations to take measures to protect or redact this information from documents.
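A tiny rule-based sketch of the classification idea above: tagging text fields that look like personally identifiable information so they can be protected or redacted. The patterns are deliberately simplified for illustration.

```python
import re

# Hypothetical, simplified PII patterns; production systems use far more
# robust detection than these two regular expressions.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def classify(field: str) -> str:
    """Return the first matching sensitivity label, else 'non-sensitive'."""
    for label, pattern in PATTERNS.items():
        if pattern.search(field):
            return label
    return "non-sensitive"

labels = [classify(f) for f in
          ["contact: jane@example.com", "call 555-867-5309", "order #1234"]]
```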


    Association

    The concept of association in data mining, closely tied to statistics, unveils connections among different sets of data or events within a dataset. This technique highlights the interdependence of specific data points or events, akin to the idea of co-occurrence in machine learning. In this context, the presence of one data-driven event serves as an indicator of the likelihood of another, shedding light on the intricate relationships embedded within the data.
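A small sketch of this association idea on transaction baskets: count which item pairs co-occur, then derive a confidence figure ("customers who bought X also bought Y"). The baskets are invented for the illustration.

```python
from itertools import combinations
from collections import Counter

# Hypothetical shopping baskets (sets of items bought together).
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

item_counts = Counter(item for b in baskets for item in b)
pair_counts = Counter(frozenset(p) for b in baskets
                      for p in combinations(sorted(b), 2))

def confidence(x, y):
    """Estimate P(y in basket | x in basket) from the counts."""
    return pair_counts[frozenset((x, y))] / item_counts[x]

c = confidence("bread", "butter")
```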

    Outlier Detection

    Outlier detection serves as a critical process in identifying anomalies within datasets. When organizations pinpoint irregularities in their data, it facilitates a deeper understanding of the underlying causes and enables proactive preparation for potential future occurrences, aligning with strategic business objectives. To illustrate, if there’s a notable surge in credit card transactions within specific time frames, organizations can leverage this information to investigate the root cause. Understanding why this surge happens allows them to optimize sales strategies for the remainder of the day, showcasing the practical application of outlier detection in refining business operations.
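The credit-card example above can be sketched with a simple z-score test: flag any hourly transaction count that sits far from the mean. The data and the 2-sigma threshold are invented for the illustration.

```python
from statistics import mean, stdev

# Hypothetical hourly transaction counts; one hour shows a surge.
hourly_txns = [102, 98, 110, 95, 101, 99, 240, 105]

mu, sigma = mean(hourly_txns), stdev(hourly_txns)
# Flag values more than two standard deviations from the mean.
outliers = [x for x in hourly_txns if abs(x - mu) / sigma > 2]
```

Production anomaly detection uses more robust statistics (the surge itself inflates the mean and standard deviation here), but the flag-and-investigate workflow is the same.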


    Clustering

    Clustering, a pivotal analytics technique, employs visual approaches to comprehend data distributions. Utilizing graphics, clustering mechanisms illustrate how data aligns with various metrics, employing different colors to highlight these distributions. Graphs, particularly in conjunction with clustering, offer a visual representation of data distribution, allowing users to discern trends relevant to their business objectives.
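Underneath the visuals, clustering is a grouping algorithm. A compact sketch: a tiny one-dimensional k-means that splits order values into k groups; the data and k=2 are invented for the illustration.

```python
from statistics import mean

# Hypothetical minimal 1-D k-means; real tools handle many dimensions
# and layer color-coded visualizations on top of the resulting groups.
def kmeans_1d(values, k=2, iters=20):
    centers = sorted(values)[::max(1, len(values) // k)][:k]  # crude seeds
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans_1d([12, 15, 14, 300, 310, 295])
```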


    Regression

    Regression techniques prove invaluable in identifying the nature of relationships between variables in a dataset. Whether causal or correlational, regression, as a transparent white box technique, elucidates the precise connections between variables. Widely applied in forecasting and data modeling, regression provides a clear understanding of how variables interrelate.
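The white-box quality is visible in the math itself: for a single predictor, ordinary least squares has a closed-form slope and intercept. A minimal sketch with invented ad-spend vs. sales figures:

```python
# Hypothetical data: ad spend (x) against sales (y).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.1, 6.2, 7.9, 10.0]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
# Closed-form ordinary least squares for one predictor.
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

predicted_at_6 = intercept + slope * 6  # extrapolate one step ahead
```

Because the fitted line is just two numbers, anyone can inspect exactly how a change in the input moves the prediction, which is the transparency the paragraph above describes.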


    Prediction

    Prediction stands as a potent facet of data mining, constituting one of the four branches of analytics. Predictive analytics leverage patterns in current or historical data to extrapolate insights into future trends. While some advanced approaches incorporate machine learning and artificial intelligence, predictive analytics can also be facilitated through more straightforward algorithms. This predictive capability offers organizations a foresight into upcoming data trends, irrespective of the complexity of the underlying techniques.
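A deliberately simple illustration of the point that prediction need not involve machine learning: a three-period moving average extrapolating the next value of a monthly sales series (invented data).

```python
# Hypothetical monthly sales series.
sales = [100, 104, 110, 112, 118, 121]

def moving_average_forecast(series, window=3):
    """Forecast the next point as the mean of the last `window` points."""
    recent = series[-window:]
    return sum(recent) / len(recent)

next_month = moving_average_forecast(sales)  # (112 + 118 + 121) / 3
```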

    Sequential Data

    Sequential patterns, a specialized data mining technique, focus on unveiling events that occur in sequence, which is particularly advantageous for analyzing transactional data. This method can reveal customer preferences, such as the type of clothing a customer is likely to purchase after acquiring a specific item. Understanding these sequential patterns empowers organizations to make targeted recommendations, thereby stimulating sales.
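A small sketch of this sequential-pattern idea: count which item tends to follow which in per-customer purchase sequences, then recommend the most common follower. The purchase histories are invented for the illustration.

```python
from collections import Counter

# Hypothetical per-customer purchase sequences, in order of purchase.
histories = [
    ["jeans", "belt", "shoes"],
    ["jeans", "belt"],
    ["shirt", "jeans", "belt"],
    ["jeans", "shoes"],
]

# Count adjacent (item, next_item) pairs across all histories.
follows = Counter((a, b) for h in histories for a, b in zip(h, h[1:]))

def recommend_after(item):
    """Return the item most often purchased right after `item`."""
    candidates = {b: c for (a, b), c in follows.items() if a == item}
    return max(candidates, key=candidates.get) if candidates else None

rec = recommend_after("jeans")
```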

    Decision Trees

    Decision trees, a subset of machine learning, serve as transparent predictive models: they make it easy to see how data inputs influence outputs. Combined into a random forest, many decision trees form a more powerful, albeit more complex, predictive model. While a random forest is closer to a black box technique, the ensemble typically achieves higher accuracy than a standalone decision tree.
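    The transparency is easy to see when a small fitted tree is written out as plain if/else rules (the features and thresholds below are invented for illustration):

```python
# A decision tree is just readable branching logic: every prediction
# can be traced through the conditions that produced it.
def approve_loan(income, debt_ratio, years_employed):
    if income >= 50_000:
        if debt_ratio < 0.4:
            return "approve"
        return "review"
    if years_employed >= 5:
        return "review"
    return "decline"

print(approve_loan(income=60_000, debt_ratio=0.3, years_employed=2))  # → approve
print(approve_loan(income=30_000, debt_ratio=0.2, years_employed=1))  # → decline
```

    A random forest would train hundreds of such trees on random subsets of the data and average their votes, trading this readability for accuracy.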

    Data Mining Analytics

    At the heart of data mining analytics lie statistical techniques, forming the foundation for various analytical models. These models produce numerical outputs tailored to specific business objectives. From neural networks to machine learning, statistical concepts drive these techniques, contributing to the dynamic field of artificial intelligence.

    Data Visualizations

    Data visualizations play a crucial role in data mining, offering users insights based on sensory perceptions. Today’s dynamic visualizations, characterized by vibrant colors, are adept at handling real-time streaming data. Dashboards, built upon different metrics and visualizations, become powerful tools to uncover data mining insights, moving beyond numerical outputs to visually highlight trends and patterns.

    Deep Learning

    Neural networks, a subset of machine learning, draw inspiration from the human brain’s neuron structure. While potent for data mining, their complexity necessitates caution. Despite the intricacy, neural networks stand out as accurate models in contemporary machine learning applications, particularly in AI and deep learning scenarios.
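    The smallest possible illustration of such a neuron-inspired model is a single perceptron; the sketch below trains one on the logical AND function (a standard textbook example, not from the article):

```python
# One artificial neuron: weighted sum + threshold, trained with the
# perceptron learning rule (nudge weights toward each mistake).
def train_perceptron(samples, epochs=10, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_gate)

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

print([predict(x1, x2) for (x1, x2), _ in and_gate])  # → [0, 0, 0, 1]
```

    Deep learning stacks many layers of such units and trains them with backpropagation; the complexity, and the caution the paragraph advises, comes from that scale, not from the individual neuron.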

    Data Warehousing

    Data warehousing, a pivotal component of data mining, has evolved beyond traditional relational databases. Modern approaches, including cloud data warehouses and those accommodating semi-structured and unstructured data in platforms like Hadoop, enable comprehensive, real-time data analysis, extending beyond historical data usage.

    Analyzing Insights

    Long-term memory processing involves the analysis of data over extended periods. Utilizing historical data, organizations can identify subtle patterns that might evade detection otherwise. This method proves particularly useful for tasks such as analyzing attrition trends over several years, providing insights that contribute to reducing churn in sectors like finance.

    ML and AI

    Machine learning and artificial intelligence represent cutting-edge advancements in data mining. Advanced forms like deep learning excel in accurate predictions at scale, making them invaluable for AI deployments such as computer vision, speech recognition, and sophisticated text analytics using natural language processing. These techniques shine in extracting value from semi-structured and unstructured data.


    In data mining, each technique serves as a distinct tool for uncovering valuable insights. From the discernment of sequential patterns to the transparent predictability of decision trees, the foundational role of statistical techniques, and the dynamic clarity of visualizations, the array of methods presents a holistic approach. These techniques empower organizations to not only analyze data effectively but also to innovate strategically in an ever-evolving data landscape, ensuring they harness the full potential of their data for informed decision-making and transformative outcomes.

    Date: December 5, 2023

    Author: Anas Baig

    Source: Dataversity

  • How do data-driven organisations really set themselves apart?


    You often hear it in boardrooms: we want to be a data-driven organisation. We want to get started with IoT, (predictive) analytics or location-based services. And yes, those are sexy applications. But what are the real business drivers? They often remain underexposed. Research shows in which areas organisations with high ‘data maturity’ are leading the pack.

    SAS surveyed almost 600 decision-makers and, based on their answers, divided the respondents into three groups: the front-runners, a middle group and the laggards. This gives a clear picture of how the front-runners set themselves apart from the laggards.

    The first thing that stands out is the proactive attitude. Front-runners free up budget to replace old processes and systems and invest in the challenge of data integration. There is also a culture of ‘continuous improvement’: these companies are constantly looking for ways to improve. This is in contrast to the laggards, who only want to invest in improvements once they know exactly what the ROI will be.

    The front-runners most often replace their old systems with open-source data platforms, with Hadoop by far the most popular. Besides technology, these companies also invest more in cleansing data. They have set up solid processes to ensure that data is up to date and of the right quality for its intended use. And governance of these processes is also better than at the companies that lag behind (read here about increasing the ROI on data and IT).

    Front-runners also invest more in talent. 73 percent of these companies have a dedicated data team staffed with their own people. The laggards more often have either no data team at all, or a team filled with external staff. Front-runners also invest more in recruiting and selecting qualified personnel. As a result, ‘only’ 38 percent of the front-runners experience a shortage of internal skills, compared to 62 percent of the laggards.

    All of this means that front-runners are better prepared for the GDPR, which takes effect in 2018.

    They are better able to identify the risks attached to a data-driven strategy, and they have taken measures to cover or reduce those risks.

    For many organisations the arrival of the GDPR is a reason to invest in a sound data strategy. But it is not the only reason. Companies with high data maturity can:

    • answer complex questions faster
    • make decisions faster
    • innovate and grow faster
    • improve the customer experience
    • grow revenue and market share
    • achieve a shorter time-to-market for new products and services
    • optimise business processes
    • produce better strategic plans and reports

    Every reason, then, to genuinely invest in data governance and data management instead of merely proclaiming that your organisation is data-driven. After all, 90 percent of respondents consider themselves data-driven, but the reality is unfortunately less rosy.

    Interested in the full research results?
    Download the report ‘How data-driven organisations are winning’ here.


    Source: Rein Mertens (SAS)

    In: www.Analyticstoday.nl

  • Creating value with predictive analytics and data mining

    The growing volume of data brings a flood of questions with it. The main question is what we can do with that data: can better services be offered and risks avoided? Unfortunately, at most companies that question remains unanswered. How can companies add value to data and move on to predictive analytics, machine learning and decision management?

    Predictive analytics: the crystal ball for the business

    Data mining makes hidden patterns in data visible, which makes it possible to predict the future. Companies, scientists and governments have used such methods for decades to derive insights about future situations from data. Modern companies use data mining and predictive analytics to detect fraud, prevent cybersecurity incidents and optimise inventory management, among other things. Through an iterative analytical process they bring together the data, the exploration of that data, and the deployment of the new insights gained from it.

    Data mining: the business in the lead

    Decision management ensures that these insights are converted into actions in the operational process. The question is how to shape this process within a company. It always starts with a question from the business and ends with an evaluation of the actions. What this Analytical Life Cycle looks like, and which questions are relevant per industry, can be read in Data Mining From A to Z: How to Discover Insights and Drive Better Opportunities.


    In addition to this model, which shows how your company can put this process to work, the paper takes a closer look at the role of data mining in the research stage. Working through the step-by-step plan below can extract even more value from data.

    1. Translate the business question into an analytical hypothesis

    2. Prepare the data for data mining

    3. Explore the data

    4. Fit the data to a model
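    The four steps above can be sketched end to end in a few lines of Python (the churn data and the threshold rule are invented for illustration; real projects would use proper tooling, but the flow is the same):

```python
from statistics import mean

# Step 1: hypothesis -- "customers with many complaints are more
# likely to churn" (a business question made testable).
raw = [
    {"complaints": 0, "churned": 0},
    {"complaints": 5, "churned": 1},
    {"complaints": 1, "churned": 0},
    {"complaints": None, "churned": 0},   # bad record
    {"complaints": 4, "churned": 1},
    {"complaints": 0, "churned": 0},
]

# Step 2: prepare -- drop records with missing values.
data = [r for r in raw if r["complaints"] is not None]

# Step 3: explore -- compare complaint counts per outcome.
churned = mean(r["complaints"] for r in data if r["churned"])
stayed = mean(r["complaints"] for r in data if not r["churned"])

# Step 4: model -- a threshold halfway between the two group means.
threshold = (churned + stayed) / 2

def predict_churn(complaints):
    return complaints >= threshold

print(predict_churn(5), predict_churn(0))  # → True False
```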

    Want to know how your company can also use data to answer tomorrow’s questions and provide better service? Then download “Data Mining From A to Z: How to Discover Insights and Drive Better Opportunities.”

  • New kid on the block in Market Intel


    Market intelligence is taking off. Now that companies increasingly have their internal information management in order, attention is turning (once again?) to information about the markets in which they operate. Once again? Yes, once again!

    The subject has been in the spotlight since the 1960s, but under the influence of developments in information technology it was repeatedly pushed into the background by attention to internal optimisation of the information supply. Executive information systems (a term from the 1980s) led to BI, and BI to DWH, ETL, reporting and scorecarding. The growth of data on social media and the web, and the possibilities in the field of unstructured data (data mining, but also machine learning), are now once again fuelling interest in applying technology to better understand the business environment. Its importance has not changed, but the possibilities are increasing.

    Three years ago, Hammer, market intelligence was founded with the goal of providing companies with market intel using modern data technology. Egbert Philips (Director of the Arnhem-based Hammer): “When it comes to management information, there should be at least as much attention for knowing and understanding the market and the business environment as for internal performance. This should not depend on technological possibilities. The development of data science and big data technologies does, however, make it possible to organise market intelligence better and more efficiently. That is what we focus on at Hammer. We want to be a partner for companies that want to know and understand their markets structurally. Information technology remains a means to that end, but a very important one.”

    Hammer may be a young company, but it is certainly not new to the field. Its founders have been active for many years in market intel and its application to, among other things, strategic planning questions. Hammer also supports more tactical decisions, however. Questions concerning pricing, sourcing/procurement, the choice of distribution partners, product development and business development cannot be answered properly without market information as input.

    At the end of November, Hammer is organising a customer event. If you are interested, send an e-mail to info@hammer-intel.com



  • PwC's 18th Annual Global CEO Survey: Mobile, Data Mining & Analysis Most Strategically Important Technologies


    81% of CEOs see mobile technologies for customer engagement as the most strategically important in 2015. 80% see data mining and analysis technologies as the most strategically important, followed by cybersecurity (78%), Internet of Things (IoT) (65%), socially-enabled business processes (61%) and cloud computing (60%).

    These and many other research findings are part of PwC’s 18th Annual Global CEO Survey. Released earlier today at the opening of the World Economic Forum Annual Meeting in Davos, Switzerland, the study provides insights into CEOs’ current priorities and future plans across a wide spectrum of areas.
    PwC interviewed 1,322 CEOs in 77 countries, with 28% of the interviews conducted by telephone, 59% online and 13% via mailed questionnaires. PwC’s sample is selected based on the percentage of the total GDP of countries included in the survey, to ensure CEOs’ views are fairly represented across all major countries and regions of the world. Please see page 40 of the study for additional details of the methodology. The free PDF of PwC’s 18th Annual Global CEO Survey is available here (pwc-18th-annual-global-ceo-survey-jan-2015.pdf).
    Key Take-Aways
    CEOs are more confident in their enterprises’ business prospects in 2015 than they are about global economic growth. 39% of CEOs are very confident in their business prospects for the year ahead, surpassing the 37% confident in global economic growth. The following graphic compares the trending of these two metrics:


    Over-regulation (78%), availability of key skills (73%) and government response to fiscal deficit and debt burden (72%) are the top three threats CEOs are most concerned about. It’s interesting to note that five of the top ten threats are technology-related. Availability of key skills, cyber threats including lack of data security, shift in consumer spending and behaviors, speed of technology change, and new market entrants are five threat areas that are being addressed by scalable, secure enterprise technologies today.


    Mobile technologies for customer engagement is the most strategically important series of technology CEOs are focusing on today. The report states that “the number of mobile phone users globally was expected to total 4.55B in 2014 – nearly 70% of the world’s population – with smartphone users totaling 1.75B. The volume of mobile traffic generated by smartphones is now about twice that of PCs, tablets and routers – despite having only surpassed them in 2013 – and is predicted to grow ten-fold by 2019”. The following graphic compares the strategic importance of key technologies.


    The majority of CEOs think that digital technologies have created high value for their organizations. CEOs cited areas including data and data analytics, customer experience, digital trust and innovation capacity as key areas where digital technologies are delivering value. Operational efficiency (88%), data and data analytics (84%) and customer experience (77%) are the top three priorities that CEOs are concentrating on today. The first graphic compares the level of value being gained from each digital investment area, and the second provides an expanded analysis of selected digital technologies’ value.





    86% of CEOs realize they need to champion the use of digital technologies for their enterprises’ digital investments to succeed. Taking control and driving change management deep into their organizations by championing digital technologies is the most effective strategy for making digital investments pay off. The following graphic shows that CEOs realize how critical their roles are in overcoming resistance to the change that technologies bring into an organization. CEOs in 2015 will champion mobility, data and analysis to strengthen their organizations’ ability to compete in increasingly turbulent markets.



  • The Chief Data Officer - Who Are They and Why Companies Need one

    Data has become one of the core areas that companies are investing in at the moment, whether they are mature, on the journey or just embarking on data projects. Every mainstream magazine has run articles about data, big data, customer data and so on, citing that data is at the heart of most initiatives, whether to do with the customer experience, creating or enhancing a new product, streamlining operational processes or getting more laser-focused on marketing.

    With this new (or should I say ongoing) trend, there has emerged a new power hitter, the data supremo, the knight in shining data (that’s enough now this isn’t a boxing match!) – please welcome to the ring the new kid on the block in the C-Suite – the Chief Data Officer or CDO. 

    Let’s take a look under the covers at who the CDO is, and what they are tasked with in this brave new world.

    The New Frontier

    The characteristics of the CDO:

    1. A Leader (yes with a capital L) that knows their “true North” and can guide companies and corral people on a journey from data immaturity to being a data competitor
    2. An advocate for all things data from: data governance, to data analytics, to data architecture, to data insights and actions
    3. An individual that has the tenure of business, and can quickly understand business models and strategy and align these to the data / information strategy – supporting the strategy to ensure the business evolves into the vision that has been set
    4. An individual who has the knowledge of technical concepts, and the ability to set the technical data strategy and collaborate with technology colleagues (namely the CIO / CTO) to ensure alignment with the overall technology strategy
    5. An individual that can lead an organisation through deep change, rethinking the way they look at decisions and encourage a data-driven approach
    6. Most importantly, and this is purely my view – a delivery and execution focus with the confidence to try things in a small way and if they fail then rethink and attempt again. This is about failing fast – none of us have the true answer to this and without a little bit of experimentation the CDO will almost definitely fail!

    I am sure there are far more characteristics that may be focused around privacy, data security, agile etc. All of these are important, and certainly up there with the responsibilities. In some industries, such as the financial world and other heavily regulated ones, there may well be Chief Information Security Officers or compliance personnel that can support the CDO. That leads me onto my next point. 

    A Seat at the Top Table

    I have a sneaky little feeling that most of you think the CDO should be sitting somewhere below the CIO / CTO or even CFO. Wrong! In my humble opinion (well, actually, experience of working with a number of CDOs across different industries), they should be reporting directly to the CEO and should be on the Board. Sitting at the top table gives them the confidence and authority to be the voice of data within the company, spelling out just what and where the company needs to go with data and how it’s going to get there: a roadmap, if you will. This is the guiding true north of the CDO. Sitting at the top and providing input from the top down ensures that agreement on data projects, and the force behind them, prevails, rather than falling down some miserable drain never to be seen again, or someone saying, “oh no, we just weren’t ready for a CDO”. Once a CDO has been employed and onboarded, the company has to ensure that data is seen as a strategic business asset: with the intelligent use of data come profitability, deeper customer understanding, better product outcomes, and better, slicker operations.

    If the CDO is on a par with the C-Suite, they can work with their counterparts more effectively: complementing the IT strategy (working with the CIO / CTO); understanding the issues the CMO is having with the marketing funnel and customers’ propensity to buy products; and supporting the CFO with deeper insights into regulatory reporting, getting it out of the blocks faster, while ensuring that governance of data permeates throughout.

    Gartner predicts that 90% of large organisations will have a Chief Data Officer role by 2019 

    The Journey Starts…in the first 100 days

    Meet and listen to your key Stakeholders: Just like presidents and prime ministers when they come into office, they have a 100-day plan (or you hope they do!). This is when they need to be at their most thoughtful, observant, curious and in deep listening mode. It isn’t the bulldozer effect here – this is where humility must be on offer – a stance of “I want to know where you have been” and “I want to listen to where you want to go” and most of all “I want to support you to get there”.  No assumptions or judgements should be in place here – leave your EGO at the door. 

    Where is the PAIN?: While meeting with key and wider stakeholders, start to understand their key issues, pain points and challenges with data: where the company hasn’t had a good experience with data, or has invested in technology for its own sake with little or no value to show for it. With this, the CDO can start to think about how their role can support the objectives that others have been set, be it departmentally, locally, regionally or globally. One of the CDOs we are working with now was set the very difficult task of supporting the company with very little resource and budget, cobbling together what they could from project to project. As this grew, business units became frustrated by the lack of impetus or push from the CDO, which led to scepticism and frustration throughout the business, from all sides. This is really key for most organisations: empower the CDO. Don’t give them a job that doesn’t have a mandate. What I’m really saying is, prepare to win, not to fail. The CDO in this case, thankfully, is now turning heads and delivering real change and benefits in a business that is going through deep change. 

    Measures / KPIs and all things glossy: This is the real icing on the cake, the sponge and creamy bits of the cake that everyone wants. How to take the pulse of the organisation and start to understand what needs to be measured in the various departments, what are they measuring now, how are they getting to that data, which business questions are being asked, which aren’t being answered and which by their very nature have been put to the side as people don’t know how to answer them. Initiatives will also emerge from the multitude of conversations – some people will talk about a Single Customer view as being the holy grail, others will want operational efficiency, some will want to take products to market quicker, others will want to know how they can predict who will buy their products in the wild west! All of these are great places to start to think and build a plan of attack. 

    Roadmap time: This is where the plan starts to take shape! The culmination of all the conversations has started to sink in, and there has been a ton of data to pick through, both technical and business in nature. Captured as a high-level picture or document, or even a PowerPoint deck (go easy on the number of slides), these are the areas the CDO needs to focus on after all of the conversations:

    Business Drivers & Goals – a deep understanding of the business strategy and where that particular roadmap is headed and how everything the CDO does, needs to align with that. 

    Business Intelligence /Analytical Maturity – a stake in the ground as to where the company is on their journey – the five areas being (just one of the many frameworks that can be used):

    • Stage 1 – BI / Analytically impaired
    • Stage 2 – Localised BI / Analytics
    • Stage 3 – BI / Analytics Aspirations
    • Stage 4 – BI / Analytics Company
    • Stage 5 – BI / Analytics Competitor

    Data Governance – are there standards, policies, processes and data owners across the organisation? The CDO needs to think about how to define these areas and take the organisation on a journey towards adopting principles such as Master Data and Reference Data Management (to name just one of the issues). Adopting a maturity meter to understand the structures that need to be put in place will help the business understand the action that is needed. The maturity assessment should focus on the following five stages and how to get there:

    • Stage 1 – No Data Governance
    • Stage 2 – Application Based Standards
    • Stage 3 – Data Stewardship
    • Stage 4 – Data Governance Council
    • Stage 5 – Enterprise Data Governance Program

    Resources & Organisation – we recently worked with an insurance company that worked with data in isolation and depended on a few key people to churn out reports week in, week out! By churn I really do mean hand-cranking them, by extracting and manipulating hundreds of spreadsheets: nightmare city! By taking a measured view across the organisation, we were able to find some astute data evangelists and analysts, as well as consulting on who they should hire, and the insurance company was able to centralise a team that would provide data and MI (as they liked to call it) to the business, thereby taking the pressure off the few who had been sweating a lot! In this case they didn’t hire a CDO; however, a CDO, when appointed, needs to sniff around the business and dig out those people who can form what might become an Analytics Centre of Excellence or a Business Intelligence Competency Centre. In another company, we were able to find these people in the business units and, over time, helped the CDO bring them into the CDO office. This helps because they have the business knowledge and can perform data analysis faster than someone from the outside could: true insights supported by their data analysts, plus the credibility of knowing the business when it comes to taking action. The main idea here is to move to a truly centralised or hub-and-spoke model to support the business, so that data is talked about at the coffee machine (I mean, what else would you talk about at the coffee machine, right?).

    Tools & technology – last and by no means least, the various investments or divestments that need to be made. What is working, what isn’t, what isn’t aligned to the overall strategy, how many BI applications are being used across the organisation, what platforms are required, and so on. This is the cool stuff that centres around visualisation to support data storytelling, predictive analytics and all the things that will drive those initiatives and give people the freedom to enjoy data and self-serve all they want. This could take into account whether the company needs to look at Hadoop or next-generation AI / machine learning and have that on the roadmap, or whether a data warehouse will suffice to begin with, and how to integrate key data to answer those business questions.

    As stated, the main artefacts produced are the Data or Information Strategy, which outlines what will be done beyond the 100-day mark, and a communications document / slideware that puts the roadmap on a page and explains it at a high level. The evangelical movement begins.

    Once the roadmap has been accepted, the CDO needs to start walking the walk, taking that action and measuring the results. There need to be regular updates with the key stakeholders, to ensure they know what has been delivered, what will be delivered, when it will be delivered and how much value is being delivered by all the initiatives – this is the real rub – the impact that the strategy has had. 

    "There is no division where you can't add value by using data." Davide Cervellin, eBay Head of EU Analytics 

    It’s a Brave New World

    Not every company out there will need a CDO, but the tide is turning and without the focus on data that is crucial to every business now – companies that don’t appoint this role may well be left behind. The competitive advantage that a CDO brings to the top table is one that is gaining ground and is much needed in times of differentiation and innovation. 

    More and more companies will continue to generate more and more data from all sources – customers, sensors, products, social etc. and the difference could be the CDO, the data strategy and aligning with the business strategy. Being able to transform, equip the business with the data they need, creating value by experimentation, not being afraid to fail, breaking through the scepticism, will win the CDO a lot of friends, but most of all, the CEO will be blessed with a C-Suite that can break the mould of the old – “how things have always been done!”

    Author: Samir Sharma



  • The Top 5 Trends in Big Data for 2017

    Last year the big data market centered squarely on technology around the Hadoop ecosystem. Since then, it’s been all about ‘putting big data to work’ through use cases shown to generate ROI from increased revenue and productivity and lower risk.

    Now, big data continues its march beyond the crater. Next year we can expect to see more mainstream companies adopting big data and IoT, with traditionally conservative and skeptical organizations starting to take the plunge.

    Data blending will be more important than it was a few years ago, when we were just getting started with Hadoop. Combining social data, mobile apps, CRM records and purchase histories via advanced analytics platforms allows marketers a glimpse into the future by bringing hidden patterns and valuable insights on current and future buying behaviors to light.

    The spread of self-service data analytics, along with widespread adoption of the cloud and Hadoop, are creating industry-wide change that businesses will either take advantage of or ignore at their peril. The reality is that the tools are still emerging, and the promise of the (Hadoop) platform is not at the level it needs to be for business to rely on it.

    As we move forward, there will be five key trends shaping the world of big data:

    The Internet of Things (IoT)

    Businesses are increasingly looking to derive value from all data; large industrial companies that make, move, sell and support physical things are plugging sensors attached to their ‘things’ into the Internet. Organizations will have to adapt technologies to map with IoT data. This presents countless new challenges and opportunities in the areas of data governance, standards, health and safety, security and supply chain, to name a few.

    IoT and big data are two sides of the same coin; billions of internet-connected 'things' will generate massive amounts of data. However, that in itself won't usher in another industrial revolution, transform day-to-day digital living, or deliver a planet-saving early warning system. Data from outside the device is the way enterprises can differentiate themselves. Capturing and analyzing this type of data in context can unlock new possibilities for businesses.

    Research has indicated that predictive maintenance can generate savings of up to 12 percent over scheduled repairs, leading to a 30 percent reduction in maintenance costs and a 70 percent cut in downtime from equipment breakdowns. For a manufacturing plant or a transport company, achieving these results from data-driven decisions can add up to significant operational improvements and savings opportunities.
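    Turning those quoted percentages into a back-of-envelope calculation for a hypothetical plant (all absolute numbers below are invented; only the percentages come from the text):

```python
# Hypothetical annual baseline figures for one plant.
scheduled_repair_cost = 1_000_000   # $/year on scheduled repairs
maintenance_cost = 2_000_000        # $/year total maintenance
downtime_hours = 500                # hours/year of breakdown downtime

# Apply the percentages cited in the article (integer math keeps
# the results exact).
savings_vs_scheduled = scheduled_repair_cost * 12 // 100   # up to 12% saved
maintenance_reduction = maintenance_cost * 30 // 100       # 30% reduction
downtime_cut = downtime_hours * 70 // 100                  # 70% fewer hours

print(savings_vs_scheduled, maintenance_reduction, downtime_cut)
# → 120000 600000 350
```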

    Deep Learning

    Deep learning, a set of machine-learning techniques based on neural networks, is still evolving but shows great potential for solving business problems. It enables computers to recognize items of interest in large quantities of unstructured and binary data, and to deduce relationships without needing specific models or programming instructions.

    These algorithms are largely motivated by the field of artificial intelligence, which has the general goal of emulating the human brain’s ability to observe, analyze, learn, and make decisions, especially for extremely complex problems. A key concept underlying deep learning methods is distributed representations of the data, in which a large number of possible configurations of the abstract features of the input data are feasible, allowing for a compact representation of each sample and leading to a richer generalization.

    Deep learning is primarily useful for learning from large amounts of unlabeled/unsupervised data, making it attractive for extracting meaningful representations and patterns from Big Data. For example, it could be used to recognize many different kinds of data, such as the shapes, colors and objects in a video — or even the presence of a cat within images, as a neural network built by Google famously did in 2012.

    As a result, the enterprise will likely see more attention placed on semi-supervised or unsupervised training algorithms to handle the large influx of data.
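    The idea of learning a compact representation from unlabeled data can be sketched in miniature. The toy below is a single-unit linear autoencoder with tied weights, trained by plain gradient descent in pure Python: it compresses unlabeled 2-D points lying near a line into one number, purely by minimizing reconstruction error. All numbers are invented for illustration; real deep learning stacks many nonlinear layers, but the unsupervised objective is the same in spirit.

    ```python
    import random

    random.seed(0)
    # Unlabeled data: points near the line y = 2x, with a little noise
    data = [(t, 2 * t + random.gauss(0, 0.05))
            for t in (i / 10 for i in range(-10, 11))]

    def recon_error(w):
        err = 0.0
        for x in data:
            c = w[0] * x[0] + w[1] * x[1]        # encode: 2-D -> 1-D code
            r = (w[0] * c, w[1] * c)             # decode: 1-D -> 2-D (tied weights)
            err += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
        return err

    w, lr = [1.0, 0.0], 0.001                    # deliberately poor initial weights
    before = recon_error(w)
    for _ in range(800):                         # plain gradient descent
        grad = [0.0, 0.0]
        for x in data:
            c = w[0] * x[0] + w[1] * x[1]
            r = (w[0] * c, w[1] * c)
            e = (r[0] - x[0], r[1] - x[1])
            dot = e[0] * w[0] + e[1] * w[1]
            for i in range(2):                   # d/dw_i of ||w(w.x) - x||^2
                grad[i] += 2 * e[i] * c + 2 * dot * x[i]
        w = [w[i] - lr * grad[i] for i in range(2)]
    after = recon_error(w)
    print(after < before)  # error shrinks as a 1-D representation is learned
    ```

    No labels were used at any point: the network discovers the direction along which the data varies simply by trying to reconstruct its input.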

    In-Memory Analytics

    Unlike conventional business intelligence (BI) software that runs queries against data stored on server hard drives, in-memory technology queries information loaded into RAM, which can significantly accelerate analytical performance by reducing or even eliminating disk I/O bottlenecks. With big data, it is the availability of terabyte-scale systems and massively parallel processing that makes in-memory more interesting.

    At this stage of the game, big data analytics is really about discovery. Running iterations to see correlations between data points doesn't work if every pass carries milliseconds of latency, multiplied by millions or billions of iterations. Working in memory is roughly three orders of magnitude faster than going to disk.
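    A toy micro-benchmark makes the RAM-versus-disk gap tangible: the same key lookup served from a Python dict in memory and from a line-by-line scan of a file on disk. This is an illustration of the principle, not a database benchmark; the keys, values and file layout are invented.

    ```python
    import os
    import tempfile
    import time

    # Build a small in-memory store and an equivalent on-disk store.
    records = {f"key{i}": f"value{i}" for i in range(1000)}
    path = os.path.join(tempfile.mkdtemp(), "store.txt")
    with open(path, "w") as f:
        for k, v in records.items():
            f.write(f"{k}\t{v}\n")

    def disk_lookup(key):
        with open(path) as f:            # full scan from disk on every query
            for line in f:
                k, v = line.rstrip("\n").split("\t")
                if k == key:
                    return v

    t0 = time.perf_counter()
    ram_result = records["key999"]       # single hash lookup in RAM
    t_ram = time.perf_counter() - t0

    t0 = time.perf_counter()
    disk_result = disk_lookup("key999")  # open + scan from disk
    t_disk = time.perf_counter() - t0

    print(ram_result == disk_result, t_disk > t_ram)
    ```

    Both paths return the same answer; the in-memory lookup is simply orders of magnitude faster, which is the whole case for in-memory analytics at iteration-heavy discovery workloads.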

    In 2014, Gartner coined the term HTAP (Hybrid Transaction/Analytical Processing) to describe a new technology that allows transactions and analytic processing to reside in the same in-memory database. It lets application leaders innovate through greater situational awareness and improved business agility; however, it entails an upheaval in established architectures, technologies and skills, driven by the use of in-memory computing technologies as enablers.

    Many businesses are already leveraging hybrid transaction/analytical processing (HTAP); for example, retailers are able to quickly identify items that are trending as bestsellers within the past hour and immediately create customized offers for that item.
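    The retail example above amounts to an analytic query running directly over live, in-memory transaction data. A hedged sketch of that query in plain Python (the sales rows and SKU names are invented; a real HTAP system would run this inside the database engine itself):

    ```python
    import time
    from collections import Counter

    now = time.time()
    # In-memory transaction rows: (timestamp, sku), as if just written by OLTP
    sales = [
        (now - 30 * 60, "sku-umbrella"),
        (now - 10 * 60, "sku-umbrella"),
        (now - 5 * 60, "sku-umbrella"),
        (now - 20 * 60, "sku-socks"),
        (now - 3 * 3600, "sku-socks"),   # outside the one-hour window, ignored
    ]

    # The analytic side: bestsellers within the past hour, over the same rows
    one_hour_ago = now - 3600
    trending = Counter(sku for ts, sku in sales if ts >= one_hour_ago)
    print(trending.most_common(1))  # [('sku-umbrella', 3)]
    ```

    Because the analytic scan reads the same in-memory rows the transactions just wrote, there is no ETL delay between a sale happening and it influencing the "trending now" offer.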

    But there’s a lot of hype around HTAP, and businesses have been overusing it. For systems where users need to see the same data in the same way many times during the day, and the data doesn’t change significantly, in-memory is a waste of money. And while you can perform analytics faster with HTAP, all of the transactions must reside within the same database. The problem is that most analytics efforts today are about combining transactions from many different systems.

    It’s All in the Cloud

    Hybrid and public cloud services continue to rise in popularity, with investors staking their claims. The key to big data success is running the (Hadoop) platform on an elastic infrastructure.

    We will see the convergence of data storage and analytics, resulting in new, smarter storage systems optimized for storing, managing and sorting massive, petabyte-scale data sets. Going forward, we can expect the cloud-based big data ecosystem to continue its momentum in the overall market, reaching well beyond the “early adopter” margin.

    Companies want a platform that allows them to scale, something that cannot be delivered through a heavy investment in a data center that is frozen in time. For example, the Human Genome Project started as a gigabyte-scale project but quickly grew to terabyte and then petabyte scale. Some leading enterprises have already begun to split workloads in a bi-modal fashion and run some data workloads in the cloud. Many expect this to accelerate strongly as these solutions move further along the adoption cycle.


    There is a big emphasis on APIs to unlock data and capabilities in a reusable way, with many companies looking to run their APIs in the cloud and in the data center. On-premises APIs offer a seamless way to unlock legacy systems and connect them with cloud applications, which is crucial for businesses that want to make a cloud-first strategy a reality.

    More businesses will run their APIs in the cloud, providing elasticity to better cope with spikes in demand and make efficient connections, enabling them to adopt and innovate faster than the competition.

    Apache Spark

    Apache Spark is lighting up big data. The popular Apache Spark project provides Spark Streaming to handle processing in near real time through a mostly in-memory, micro-batching approach. It has moved from being a component of the Hadoop ecosystem to the big data platform of choice for a number of enterprises.

    Now the largest big data open-source project, Spark provides dramatically faster data processing than Hadoop and, as a result, a much more natural and convenient model for programmers. It provides an efficient, general-purpose framework for parallel execution.

    Spark Streaming, one of Spark’s core components, streams large volumes of data with help from the core by breaking the data into smaller packets and then transforming them, thereby accelerating the creation of RDDs (resilient distributed datasets). This is very useful in today’s world, where data analysis often requires the resources of a fleet of machines working together.
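    Spark itself needs a cluster runtime, so the following is a plain-Python sketch of the micro-batching idea behind Spark Streaming rather than real Spark code: an unbounded stream is cut into small batches, and an ordinary batch transformation is applied to each one as it fills. The stream contents and batch size are invented for illustration.

    ```python
    def micro_batches(stream, batch_size):
        """Cut an (unbounded) iterable into fixed-size batches."""
        batch = []
        for record in stream:
            batch.append(record)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            yield batch                  # flush the final partial batch

    stream = iter(range(10))             # stand-in for an incoming data stream
    # Apply a normal batch computation (here: a sum) to each micro-batch
    results = [sum(b) for b in micro_batches(stream, batch_size=4)]
    print(results)  # [6, 22, 17]
    ```

    This is why Spark Streaming is described as "near real time": latency is bounded by the batch interval, in exchange for reusing the same batch-processing engine on each slice.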

    However, it’s important to note that Spark is meant to enhance, not replace, the Hadoop stack. To gain even greater value from big data, many companies are considering using Hadoop and Spark together for better analytics and storage capabilities.

    Increasingly sophisticated big data demands mean the pressure to innovate will remain high. If they haven’t already, businesses will begin to see that customer success is a data job. Companies that are not capitalizing on data analytics will start to go out of business, while successful enterprises realize that the key to growth is data refinement and predictive analytics.

    Information Management, 2016; Brad Chivukala

  • Why a Business Data Scientist course


    Every organization that is changing is searching. That search often concerns data: how can we apply data better? How can we find new applications for data? Do we even have the right data? What should we do with data science and big data? How can we use data to make better decisions and thus perform better?

    Organizations must find answers to these questions, partly under pressure from a developing market and changing competition. Data thereby takes a central place in business operations, and organizations become 'data driven'.

    Naturally, this requires 'data science': the environments and skills to dissect and analyze data and to translate it into models, recommendations and decisions.

    We designed the Business Data Scientist course because no company will succeed with tools and techniques alone. It is precisely the business data scientist who forms the bridge between data science and the change taking place within organizations.

    Too often, organizations focus on the technology (Hadoop? Spark? A data lake? Should we learn R?). To succeed with data, you also need other instruments. Business models, business analysis and strategy formation help formulate the right questions and set goals. Soft skills and change-management skills make those goals visible to sponsors and stakeholders. Knowledge of data science, architecture, methods and organizational models provides the insight to fit data science into an organization. Vision and leadership are needed to make data science work within an organization. Our goal is to show participants all of this. The course is designed to bring these aspects together and to provide usable instruments.

    What do I like most about this course? Constantly building the bridge to: what do we do now, how do we put the theory into practice? Every piece of theory is translated into a practical application in the case study. And that is the first step toward achieving success with data science in your own work, team, department, division or organization.

    Want to know more? Interested? On 28 November there will be a theme evening on the Business Data Scientist in Utrecht. You can register via the Radboud Management Academy!

    This blog originally appeared on www.businessdecision.nl.

    Author: Alex Aalberts


  • Employers use big data to predict who will get sick

    A growing number of companies in the US are working with health insurers and with parties that collect health data to find out which employees are at risk of falling ill.

    The Wall Street Journal reports on this. To keep healthcare costs in check, some companies are joining forces with firms that collect and process all kinds of data about employees. From this they can deduce, for example, who is at risk of developing diabetes.

    Based on that information, employees are then told that they should see a doctor or change their habits. "You can better predict someone's risk of a heart attack by looking at where he or she shops than by looking at chromosomes," says Harry Greenspun of Deloitte's Center for Health Solutions in the Wall Street Journal.

    Wide-ranging information
    The director of one such data-mining company says they can use the most wide-ranging information to predict employees' health, from where you shop to employees' creditworthiness. The reasoning is that people with little money are more likely not to buy the right medication when a doctor advises it.

    Another company, Castlight, offers a platform on which employees can compare health insurers. Based on the data it collects this way, it recently even developed a product that can predict whether a female employee will soon become pregnant. It infers this from a number of personal characteristics, which it then links to information from claims filed with the health insurer. Suddenly stop claiming for the pill, while you are 30 and already have a child? Then the alarm bells go off.

    Whether you vote in US elections for the House of Representatives or the Senate also says a lot about you. People who vote are generally more mobile and active, and that says a lot about your overall health.

    Enormous risks, or simply useful?
    Not everyone is enthusiastic about insurers' and companies' zeal for collecting data. Frank Pasquale, a professor at the University of Maryland, tells the Wall Street Journal that he sees 'enormous potential risks'.

    Some employees are unpleasantly surprised to find that they receive medical advice from their employer based on medical tests. But it can also prompt them to change their behavior and thereby reduce the risk of a particular condition. The paper quotes a female employee who was at risk of developing diabetes; she has since lost considerable weight and the risk has decreased.

    Source: RTL Z
