68 items tagged "data science"

  • ‘Progress in BI, but keep an eye on ROI’

    Business intelligence (BI) was already named by Gartner as the top priority for the CIO in 2016. The Computable experts likewise predict that many large steps will be taken in BI. At the same time, managers must also look back and reflect on their business model when deploying big data: how do you justify the investments in big data?

    Kurt de Koning, founder of Dutch Offshore ICT Management
    Business intelligence/analytics has been put at number one by Gartner on the 2016 priority list for the CIO. In 2016, users will increasingly base their decisions on management information drawn from multiple sources, and those sources will partly consist of unstructured data. BI tools will therefore have to do more than present information in a visually attractive way and offer a good user interface. When it comes to unlocking the data, the tools that stand out will be those able to create order and overview out of the many forms in which data appears.

    Laurent Koelink, senior interim BI professional at Insight BI
    Big data solutions alongside traditional BI
    Due to the growth in the number of smart devices, organizations have ever more data to process. Because insight (in the broadest sense) will be one of the most important success factors of the future for many organizations that want to respond flexibly to market demand, they will also have to be able to analyze all these new forms of information. I do not see big data as a replacement for traditional BI solutions, but rather as a complement when it comes to the analytical processing of large volumes of (mostly unstructured) data.

    In-memory solutions
    Organizations increasingly run into the performance limitations of traditional database systems when large volumes of data have to be analyzed ad hoc. Specific hybrid database/hardware solutions such as those from IBM, SAP and Teradata have long offered answers to this. They are now increasingly joined by in-memory solutions, partly because these are becoming more affordable and therefore more accessible, and partly because such solutions are becoming available in the cloud, which keeps their costs well under control.

    Virtual data integration
    Where data is now often still physically brought together in separate databases (data warehouses), this will, where possible, be replaced by smart metadata solutions that (with or without temporary physical, sometimes in-memory, storage) make time-consuming data extraction and integration processes unnecessary.

    Agile BI development
    Organizations are increasingly forced to move flexibly within and along with the chain in which they operate. This means that the insights used to steer the business (the BI solutions) must move flexibly with them, which requires a different way of working from BI development teams. More and more, you therefore see methods such as Scrum being applied to BI development as well.

    BI for everyone
    Where BI has traditionally been the domain of organizations, consumers are now making ever more frequent use of BI solutions as well. Well-known examples are insight into personal finances and energy consumption. The analysis of income and expenses in your bank's web portal or app, and the analysis of data from smart energy meters, are telling examples. In the coming years this will only increase and become further integrated.

    Rein Mertens, head of analytical platform at SAS
    An important trend that I see maturing in 2016 is 'streaming analytics'. Today, big data is an inescapable part of our daily practice. The amount of data generated per second keeps increasing, in both the personal and the business sphere. Just look at your daily use of the internet, e-mails, tweets, blog posts and other social networks. And on the business side: customer interactions, purchases, customer service calls, promotion via SMS and social networks, et cetera.

    An increase in volume, variety and velocity of five exabytes every two days worldwide, and that figure even excludes data from sensors and other IoT devices. There is bound to be interesting information hidden in all this data, but how do you get at it? One way is to make the data accessible and store it in a cost-effective big data platform. A technology like Hadoop then inevitably comes into play, after which you use data visualization and advanced analytics to extract relationships and insights from that mountain of data. You send the complex logic to the data, as it were, without having to pull all the data out of the Hadoop cluster.

    But what if you want to make smart decisions in real time on the basis of these large volumes of data? Then there is no time to store the data first and analyze it afterwards. Instead you want to assess, aggregate, track and analyze the data directly in-stream, for example to detect unusual transaction patterns or analyze sentiment in text and act on it immediately. In effect, you send the data past the logic: logic that sits in memory and has been built to do this very fast and very intelligently, storing only the final results. Volumes of more than a hundred thousand transactions are no exception here. Per second, that is. Stream it, score it, store it. That is streaming analytics!
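
    To make the "stream it, score it, store it" idea concrete, here is a minimal sketch in Python (our own illustration, not SAS's engine; the transaction source, window size and threshold are invented): events are scored in memory as they arrive, and only the flagged results are kept.

import random
import statistics
from collections import deque

def transaction_stream(n=100_000):
    # Hypothetical source of transaction amounts; roughly 1 percent are unusually large.
    for _ in range(n):
        yield random.gauss(100, 20) if random.random() > 0.01 else random.uniform(500, 1000)

def score_and_store(stream, window=200, threshold=4.0):
    recent = deque(maxlen=window)   # in-memory state only, no database round-trip
    flagged = []                    # "store it": keep just the scored results
    for amount in stream:           # "stream it"
        if len(recent) > 30:
            mean = statistics.mean(recent)
            spread = statistics.pstdev(recent) or 1.0
            score = (amount - mean) / spread      # "score it" while the event is in flight
            if score > threshold:
                flagged.append((round(amount, 2), round(score, 1)))
        recent.append(amount)
    return flagged

print(len(score_and_store(transaction_stream())), "suspicious transactions flagged")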

    Minne Sluis, founder of Sluis Results
    From IoT (internet of things) to IoE (internet of everything)
    Everything is becoming digital and connected, even more so than we could imagine only a short while ago. The application of big data methods and techniques will therefore take an even greater flight.

    The call for adequate data governance will grow
    Although the new world revolves around letting go, giving trust and freedom, and co-creation, the call for manageability will nonetheless increase. Provided it is approached primarily from a facilitating role and ensures greater consistency and reliability, that is by no means a bad thing.

    The business impact of big data & data science keeps growing
    The impact of big data & data science in reinventing business processes, services and products, digitizing them far more deeply (and making them more intelligent), or in some cases eliminating them altogether, will continue.

    Consumerization of analytics continues
    Greatly improved and genuinely intuitive visualizations, underpinned by good meta-models and thus by data governance, are driving this development. Democratization and independence from third parties (other than services deliberately taken from the cloud) are thereby increasingly becoming reality.

    Big data & data science will fully break through in the non-profit sector
    The subtle objectives of the non-profit sector, such as improving quality, (patient/client/citizen) safety, punctuality and accessibility, call for big data applications. That subtlety requires more good information, and therefore more data, delivered faster and with more detail and nuance than what typically comes out of the more traditional BI environments today. If the non-profit sector manages to translate the profit sector's much-needed focus on 'profit' and 'revenue growth' to its own situation, successful big data initiatives are just around the corner. Mind you, this prediction naturally applies in full to healthcare as well.

    Hans Geurtsen, business intelligence architect data solutions at Info Support
    From big data to polyglot persistence
    In 2016 we will no longer talk about big data, but simply about data: data of all kinds and in all volumes that call for different kinds of storage, in other words polyglot persistence. Programmers have known the term polyglot for a long time. An application anno 2015 is often already written in several languages. But on the storage side of an application, relational databases are no longer the only game in town either. We will increasingly use other kinds of databases in our data solutions, such as graph databases and document databases. Alongside specialists who know everything about one kind of database, you will then also need generalists who know exactly which database is suited to which purpose.

    The breakthrough of the modern data warehouse
    'A polyglot is someone with a high degree of proficiency in several languages', according to Wikipedia. That refers to spoken languages, but you come across the term more and more often in IT as well: an application coded in several programming languages that stores its data in several kinds of databases. On the business intelligence side, too, a single language or a single environment no longer suffices. The days of the traditional data warehouse with an ETL pipeline, a central data warehouse and one or two BI tools are numbered. We will see new kinds of data platforms in which all sorts of data from all sorts of sources become accessible to information workers and data scientists using all sorts of tools.

    Business intelligence in the cloud
    Where Dutch companies in particular are still hesitant when it comes to the cloud, the move toward the cloud is slowly but surely getting underway. More and more companies realize that security, in particular, is often better arranged in the cloud than they could arrange it themselves. Cloud providers are also doing more and more to bring European companies to their cloud. Microsoft's new data centers in Germany, where not Microsoft but Deutsche Telekom controls access to customer data, are an example of this. 2016 may well become the year in which the cloud really breaks through and in which we will also see more and more complete BI solutions in the cloud in the Netherlands.

    Huub Hillege, principal data(base) management consultant at Info-Shunt
    Big data
    The big data hype will certainly continue in 2016, but success at companies is by no means guaranteed in advance. Companies and recent graduates keep winding each other up about its application. It is hard to understand why everyone wants to tap into Facebook, Twitter and similar data while the data in these systems is highly unreliable. At every conference I ask where the business case is, including costs and benefits, that justifies all the investments around big data. Even BI managers at companies encourage people to just get started. In effect that means looking backwards at the data you happen to have or can get, and searching for something you might be able to use. To me this is the biggest pitfall, just as it was at the start of data warehouses in 1992. In the current circumstances companies have limited money; frugality is called for.

    The analysis of big data must be aimed at the future, starting from a clear business strategy and a cost/benefit analysis: which data do I need to support the future? Determine:

    • Where do I want to go?
    • Which customer segments do I want to add?
    • Are we going to do more cross-selling (more products) with our current customers?
    • Are we going to take steps to retain our customers (churn)?

    Once these questions have been set down and prioritized, an analysis must be carried out:

    • Which data and sources do we need for this?
    • Do we have the data ourselves, are there 'gaps', or do we need to buy external data?

    Database management systems
    More and more database management system (DBMS) vendors are adding support for big data solutions, for example Oracle/Sun Big Data Appliance and Teradata/Teradata Aster with support for Hadoop. In the long run the DBMS solutions will dominate the field; big data software solutions without a DBMS will ultimately lose out.

    Fewer and fewer people, including today's DBAs, understand how things work deep inside a database/DBMS. More and more often, physical databases are generated from logical data modeling tools, and formal physical database design steps and reports are skipped. Developers who use ETL tools such as Informatica, Ab Initio, InfoSphere, Pentaho et cetera likewise end up generating SQL scripts that move data from sources to operational data stores and/or the data warehouse.

    BI tools such as MicroStrategy, Business Objects, Tableau et cetera also generate SQL statements.
    Such tools are usually developed initially for a particular DBMS, and people quickly assume they are therefore applicable to every DBMS. As a result, too little use is made of the specific physical characteristics of each DBMS.

    The absence of real expertise then causes performance problems that are discovered at too late a stage. In recent years, by changing database designs and indexes and by restructuring complex or generated SQL scripts, I have been able to bring ETL processes down from six to eight hours to one minute, and queries that ran for 45 to 48 hours down to 35 to 40 minutes.
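
    As a small illustration of the kind of tuning meant here (a sketch with an invented table, not any of the actual systems mentioned): adding an index that matches a query's filter column turns a full table scan into an index search.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 10_000, i * 0.01) for i in range(200_000)],
)

query = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"

# Without a supporting index the planner scans the whole table ...
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())

# ... with an index on the filter column the same query becomes an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())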

    Advice
    The amount of data needed will keep growing. Forget buying all kinds of hyped software packages. Make sure you bring in very strong technical database/DBMS expertise to build the foundation properly from the bottom up, using the strengths of the DBMS you already have. That frees up time and money (you can manage with smaller systems because the foundation is solid) to select the right tools after a sound business case and proofs of concept.

  • 4 tips to keep big data projects from dying a slow death


    Investing in big data makes the difference between attracting and repelling customers, between profit and loss. Yet many retailers see their data and analytics initiatives slowly die. How do you actually create value from data and avoid a clearance sale? Four tips.

    You invest a lot of time and money in big data, exactly as the retail gurus have been preaching for several years. A team of data scientists develops complex data models that do indeed yield interesting insights. With small 'proofs of value' you establish that those insights can actually be monetized. Yet that is not what happens next. What is going on?

    Tip 1: Adjust the targets

    The fact that valuable insights are not put into practice often has to do with the targets your employees have been given. Take sending mailings to customers as an example. Based on existing data and customer profiles, we can predict quite well how often and with which message each customer should be emailed. And secretly every marketer knows perfectly well that not every customer is waiting for a daily email.

    Yet many still fall into the trap and keep sending mailing after mailing to the entire customer base. The result: the customer's interest quickly ebbs away and the message no longer lands. Why do marketers do this? Because they are judged solely on the revenue they generate, not on the customer satisfaction they achieve. That invites them to email everyone as often as possible; in the short term, after all, every extra email increases the chance of a sale.
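
    As a sketch of what such a prediction could look like (the feature set and model choice are our own assumptions, not a description of any retailer's setup): a simple count regression can estimate a sensible mailing frequency per customer from engagement history.

import numpy as np
from sklearn.linear_model import PoissonRegressor

# Invented features per customer: [emails received last month, emails opened, purchases]
X = np.array([[30, 2, 0], [8, 6, 1], [12, 1, 0], [4, 4, 2], [20, 10, 3]])
# Invented target: number of emails per month the customer actually engaged with
y = np.array([1, 5, 1, 3, 8])

model = PoissonRegressor().fit(X, y)

new_customer = np.array([[15, 3, 1]])
print(f"Suggested mails per month: {model.predict(new_customer)[0]:.1f}")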

    Tip 2: Put the analysts in the business

    Time and again, retailers put their team of analysts together in a separate room, sometimes even as part of the IT department. The distance to the people in the business who have to put the insights into practice is large, and too often that distance proves unbridgeable. The result is misunderstandings, misunderstood analysts and valuable insights that go unused.

    It is better to put the analysts and the people from the business together in multidisciplinary teams that work with scrum-like techniques. Organizations that are successful realize they have to be in constant change and work in teams like these. It means that business managers are involved at an early stage in building the data models, so that analysts and the business can learn from each other. Customer knowledge, after all, resides in data as well as in people.

    Tip 3: Hire a business analyst

    Data analysts get their job satisfaction mainly from producing elegant analyses and building good, perhaps even over-engineered, data models. For their sense of fulfillment it is often not even necessary to put the insights from those models into practice. Many analysts are therefore not particularly good at interpreting data and translating it into concrete impact for the retailer.

    It can therefore be wise to bring in a business analyst: someone who has enough affinity with analytics and broadly understands how data models come about, but who also knows what the business managers' challenges are. Such a person can bridge the gap between analytics and the business by making questions from the business concrete and by translating insights from data models into opportunities for the retailer.

    Tip 4: Analytics is a process, not a project

    Too many retailers still look at all their data and analytics efforts as if they were a project with a clear beginning and end, a project whose payoff must be known in advance. This is especially the case at retail organizations led by managers of the 'old generation' who have too little feeling for and affinity with the new world. The commitment of these managers quickly fades when investments in data and analytics do not pay off fast enough.

    Analytics, however, is not a project but a process in which retailers become ever more skilled and smarter through trial and error. A process whose outcome is unclear in advance, but one that has to be started in order to move forward. Because all the developments in the retail market make one thing clear: standing still means falling behind.

    Author: Simon van Ulden (EY), 5 October 2016

  • A look at the major trends driving next generation datacenters

    Data centers have become a core component of modern living, containing and distributing the information required to participate in everything from social life to the economy. In 2017, data centers consumed 3 percent of the world’s electricity, and new technologies are only increasing their energy demand. The growth of high-performance computing — as well as answers to growing cyber-security threats and efficiency concerns — is dictating the development of the next generation of data centers.

    But what will these new data centers need in order to overcome the challenges the industry faces? Here is a look at 5 major trends that will impact data center design in the future.

    1. Hyperscale functionality

    The largest companies in the world are increasingly consolidating computing power in massive, highly efficient hyperscale data centers that can keep up with the increasing demands of enterprise applications. These powerful data centers are mostly owned by tech giants like Amazon or Facebook, and there are currently around 490 of them in existence with more than 100 more in development. It’s estimated that these behemoths will contain more than 50 percent of all data that passes through data centers by 2021, as companies take advantage of their immense capabilities to implement modern business intelligence solutions and grapple with the computing requirements of the Internet of Things (IoT).

    2. Liquid efficiency

    The efficiency of data centers is both an environmental concern and a large-scale economic issue for operators. Enterprises in industries as diverse as automotive design and financial forecasting are building machine learning into their applications, which drives up both the cost and the operating temperature of data center infrastructure. Power and cooling are widely known to be the biggest costs data center owners have to contend with, but new technologies are emerging to address them. Liquid cooling is swiftly becoming more popular for those building new data centers because of its efficiency and its ability to future-proof data centers against the increasing heat generated by demand for high-performance computing. The market is expected to grow to $2.5 billion by 2025 as a result.

    3. AI monitoring

    Monitoring software that implements the critical advances made in machine learning and artificial intelligence is one of the most successful technologies that data center operators have put into practice to improve efficiency. Machines are much more capable of reading and predicting the needs of data centers second to second than their human counterparts, and with their assistance operators can manipulate cooling solutions and power usage in order to dramatically increase energy efficiency.
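
    A heavily simplified sketch of the underlying idea (hypothetical telemetry and a deliberately simple model, not any vendor's product): learn how IT load and outside temperature relate to cooling demand, then use the model to provision cooling ahead of time.

import numpy as np
from sklearn.linear_model import LinearRegression

# Invented historical telemetry: [IT load in kW, outside temperature in C]
X = np.array([[400, 10], [550, 18], [700, 25], [620, 30], [480, 15]])
# Invented cooling power drawn at those moments (kW)
y = np.array([120, 180, 260, 250, 150])

model = LinearRegression().fit(X, y)

# Predict cooling demand for the next interval and adjust setpoints before it hits.
forecast_load, forecast_temp = 650, 28
predicted_cooling = model.predict([[forecast_load, forecast_temp]])[0]
print(f"Pre-provision roughly {predicted_cooling:.0f} kW of cooling")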

    4. DNA storage

    In the two-year span between 2015 and 2017, more data was created than in all of preceding history. As this exponential growth continues, we may soon see the sheer quantity of data outstrip the ability of hard drives to capture it. But researchers are exploring the possibility of storing this immense amount of data within DNA, as it is said that a single gram of DNA is capable of storing 215 million gigabytes of information. DNA storage could provide a viable solution to the limitations of encoding on silicon storage devices, and meet the requirements of an ever-increasing number of data centers despite land constraints near urban areas. But it comes with its own drawbacks. Although it has improved considerably, it is still expensive and extremely slow to write data to DNA. Furthermore, getting data back from DNA involves sequencing it, and decoding, finding and retrieving specific files stored on DNA is a major challenge. However, according to Microsoft research, algorithms currently being developed may make the cost of sequencing and synthesizing DNA plunge to levels that make it feasible in the future.

    5. Dynamic security

    The average cost of a cyber-attack to the impacted businesses will be more than $150 million by 2020, and data centers are at the center of the modern data security fight. Colocation facilities have to contend with the security protocols of multiple customers, and the march of data into the cloud means that hackers can gain access to it through multiple devices or applications. New physical and cloud security features are going to be critical for the evolution of the data center industry, including biometric security measures on-site to prevent physical access by even the most committed thieves or hackers. More strict security guidelines for cloud applications and on-site data storage will be a major competitive advantage for the most effective data center operators going forward as cyber-attacks grow more costly and more frequent.

    The digital economy is growing more dense and complex every single day, and data center builders and operators need to upgrade and build with the rising demand for artificial intelligence and machine learning in mind. This will make it necessary for greener, more automated, more efficient and more secure data centers to be able to safely host the services of the next generation of digital companies.

    Author: Gavin Flynn

    Source: Information-management

  • A new quantum approach to big data

    From gene mapping to space exploration, humanity continues to generate ever-larger sets of data — far more information than people can actually process, manage, or understand.
    Machine learning systems can help researchers deal with this ever-growing flood of information. Some of the most powerful of these analytical tools are based on a strange branch of geometry called topology, which deals with properties that stay the same even when something is bent and stretched every which way.


    Such topological systems are especially useful for analyzing the connections in complex networks, such as the internal wiring of the brain, the U.S. power grid, or the global interconnections of the Internet. But even with the most powerful modern supercomputers, such problems remain daunting and impractical to solve. Now, a new approach that would use quantum computers to streamline these problems has been developed by researchers at MIT, the University of Waterloo, and the University of Southern California.
    The team describes their theoretical proposal this week in the journal Nature Communications. Seth Lloyd, the paper’s lead author and the Nam P. Suh Professor of Mechanical Engineering, explains that algebraic topology is key to the new method. This approach, he says, helps to reduce the impact of the inevitable distortions that arise every time someone collects data about the real world.


    In a topological description, basic features of the data (How many holes does it have? How are the different parts connected?) are considered the same no matter how much they are stretched, compressed, or distorted. Lloyd explains that it is often these fundamental topological attributes “that are important in trying to reconstruct the underlying patterns in the real world that the data are supposed to represent.”


    It doesn’t matter what kind of dataset is being analyzed, he says. The topological approach to looking for connections and holes “works whether it’s an actual physical hole, or the data represents a logical argument and there’s a hole in the argument. This will find both kinds of holes.”
    Using conventional computers, that approach is too demanding for all but the simplest situations. Topological analysis “represents a crucial way of getting at the significant features of the data, but it’s computationally very expensive,” Lloyd says. “This is where quantum mechanics kicks in.” The new quantum-based approach, he says, could exponentially speed up such calculations.


    Lloyd offers an example to illustrate that potential speedup: If you have a dataset with 300 points, a conventional approach to analyzing all the topological features in that system would require “a computer the size of the universe,” he says. That is, it would take 2^300 (two to the 300th power) processing units — approximately the number of all the particles in the universe. In other words, the problem is simply not solvable in that way.
    “That’s where our algorithm kicks in,” he says. Solving the same problem with the new system, using a quantum computer, would require just 300 quantum bits — and a device this size may be achieved in the next few years, according to Lloyd.
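
    Written out, the comparison Lloyd draws is between a resource count that grows exponentially with the number of data points and one that grows only linearly (the arithmetic below is our paraphrase of that claim, not a formula from the paper):

\[
2^{300} \approx 2 \times 10^{90} \ \text{classical processing units}
\qquad \text{versus} \qquad
300 \ \text{qubits}.
\]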


    “Our algorithm shows that you don’t need a big quantum computer to kick some serious topological butt,” he says.
    There are many important kinds of huge datasets where the quantum-topological approach could be useful, Lloyd says, for example understanding interconnections in the brain. “By applying topological analysis to datasets gleaned by electroencephalography or functional MRI, you can reveal the complex connectivity and topology of the sequences of firing neurons that underlie our thought processes,” he says.


    The same approach could be used for analyzing many other kinds of information. “You could apply it to the world’s economy, or to social networks, or almost any system that involves long-range transport of goods or information,” says Lloyd, who holds a joint appointment as a professor of physics. But the limits of classical computation have prevented such approaches from being applied before.


    While this work is theoretical, “experimentalists have already contacted us about trying prototypes,” he says. “You could find the topology of simple structures on a very simple quantum computer. People are trying proof-of-concept experiments.”


    Ignacio Cirac, a professor at the Max Planck Institute of Quantum Optics in Munich, Germany, who was not involved in this research, calls it “a very original idea, and I think that it has a great potential.” He adds “I guess that it has to be further developed and adapted to particular problems. In any case, I think that this is top-quality research.”
    The team also included Silvano Garnerone of the University of Waterloo in Ontario, Canada, and Paolo Zanardi of the Center for Quantum Information Science and Technology at the University of Southern California. The work was supported by the Army Research Office, Air Force Office of Scientific Research, Defense Advanced Research Projects Agency, Multidisciplinary University Research Initiative of the Office of Naval Research, and the National Science Foundation.

    Source: MIT News

  • A Shortcut Guide to Machine Learning and AI in The Enterprise


    Predictive analytics / machine learning / artificial intelligence is a hot topic – what’s it about?

    Using algorithms to help make better decisions has been the “next big thing in analytics” for over 25 years. It has been used in key areas such as fraud detection the entire time. But it has now become a full-throated mainstream business meme that features in every enterprise software keynote — although the industry is still battling over what to call it.

    It appears that terms like Data Mining, Predictive Analytics, and Advanced Analytics are considered too geeky or old for industry marketers and headline writers. The term Cognitive Computing seemed to be poised to win, but IBM’s strong association with the term may have backfired — journalists and analysts want to use language that is independent of any particular company. Currently, the growing consensus seems to be to use Machine Learning when talking about the technology and Artificial Intelligence when talking about the business uses.

    Whatever we call it, it’s generally proposed in two different forms: either as an extension to existing platforms for data analysts; or as new embedded functionality in diverse business applications such as sales lead scoring, marketing optimization, sorting HR resumes, or financial invoice matching.

    Why is it taking off now, and what’s changing?

    Artificial intelligence is now taking off because there is a lot more data available, and affordable, powerful systems to crunch through it all. It’s also much easier to get access to powerful algorithm-based software in the form of open-source products or embedded as a service in enterprise platforms.

    Organizations today have also become more comfortable with manipulating business data, with a new generation of business analysts aspiring to become “citizen data scientists.” Enterprises can take their traditional analytics to the next level using these new tools.

    However, we’re now at the “Peak of Inflated Expectations” for these technologies according to Gartner’s Hype Cycle — we will soon see articles pushing back on the more exaggerated claims. Over the next few years, we will find out the limitations of these technologies even as they start bringing real-world benefits.

    What are the longer-term implications?

    First, easier-to-use predictive analytics engines are blurring the line between “everyday analytics” and the data science team. A “factory” approach to creating, deploying, and maintaining predictive models means data scientists can have greater impact. And sophisticated business users can now access some of the power of these algorithms without having to become data scientists themselves.

    Second, every business application will include some predictive functionality, automating any areas where there are “repeatable decisions.” It is hard to think of a business process that could not be improved in this way, with big implications in terms of both efficiency and white-collar employment.

    Third, applications will use these algorithms on themselves to create “self-improving” platforms that get easier to use and more powerful over time (akin to how each new semi-autonomous-driving Tesla car can learn something new and pass it onto the rest of the fleet).

    Fourth, over time, business processes, applications, and workflows may have to be rethought. If algorithms are available as a core part of business platforms, we can provide people with new paths through typical business questions such as “What’s happening now? What do I need to know? What do you recommend? What should I always do? What can I expect to happen? What can I avoid? What do I need to do right now?”

    Fifth, implementing all the above will involve deep and worrying moral questions in terms of data privacy and allowing algorithms to make decisions that affect people and society. There will undoubtedly be many scandals and missteps before the right rules and practices are in place.

    What first steps should companies be taking in this area?
    As usual, the barriers to business benefit are more likely to be cultural than technical.

    Above all, organizations need to make sure they have the right technical expertise to be able to navigate the confusion of new vendor offerings, the right business knowledge to know where best to apply them, and the awareness that their technology choices may have unforeseen moral implications.

    Source: timoelliot.com, October 24, 2016

     

  • About how Uber and Netflix turn Big Data into real business value


    From the way we go about our daily lives to the way we treat cancer and protect our society from threats, big data will transform every industry, every aspect of our lives. We can say this with authority because it is already happening.

    Some believe big data is a fad, but they could not be more wrong. The hype will fade, and even the name may disappear, but the implications will resonate and the phenomenon will only gather momentum. What we currently call big data today will simply be the norm in just a few years’ time.

    Big data refers generally to the collection and utilization of large or diverse volumes of data. In my work as a consultant, I work every day with companies and government organizations on big data projects that allow them to collect, store, and analyze the ever-increasing volumes of data to help improve what they do.

    In the course of that work, I’ve seen many companies doing things wrong — and a few getting big data very right, including Netflix and Uber.

    Netflix: Changing the way we watch TV and movies

    The streaming movie and TV service Netflix is said to account for one-third of peak-time Internet traffic in the US, and the service now has 65 million members in over 50 countries enjoying more than 100 million hours of TV shows and movies a day. Data from these millions of subscribers is collected and monitored in an attempt to understand our viewing habits. But Netflix’s data isn’t just “big” in the literal sense. It is the combination of this data with cutting-edge analytical techniques that makes Netflix a true Big Data company.

    Although Big Data is used across every aspect of the Netflix business, their holy grail has always been to predict what customers will enjoy watching. Big Data analytics is the fuel that fires the “recommendation engines” designed to serve this purpose.

    At first, analysts were limited by the lack of information they had on their customers. As soon as streaming became the primary delivery method, many new data points on their customers became accessible. This new data enabled Netflix to build models to predict the perfect storm situation of customers consistently being served with movies they would enjoy.

    Happy customers, after all, are far more likely to continue their subscriptions.

    Another central element to Netflix’s attempt to give us films we will enjoy is tagging. The company pay people to watch movies and then tag them with elements the movies contain. They will then suggest you watch other productions that were tagged similarly to those you enjoyed. 

    Netflix’s letter to shareholders in April 2015 shows their Big Data strategy was paying off. They added 4.9 million new subscribers in Q1 2015, compared to four million in the same period in 2014. In Q1 2015 alone, Netflix members streamed 10 billion hours of content. If Netflix’s Big Data strategy continues to evolve, that number is set to increase.

    Uber: Disrupting car services in the sharing economy

    Uber is a smartphone app-based taxi booking service which connects users who need to get somewhere with drivers willing to give them a ride. 

    Uber’s entire business model is based on the very Big Data principle of crowdsourcing: anyone with a car who is willing to help someone get to where they want to go can offer to help get them there. This gives greater choice for those who live in areas where there is little public transport, and helps to cut the number of cars on our busy streets by pooling journeys.

    Uber stores and monitors data on every journey their users take, and use it to determine demand, allocate resources and set fares. The company also carry out in-depth analysis of public transport networks in the cities they serve, so they can focus coverage in poorly served areas and provide links to buses and trains.

    Uber holds a vast database of drivers in all of the cities they cover, so when a passenger asks for a ride, they can instantly be matched with the most suitable drivers. The company have developed algorithms to monitor traffic conditions and journey times in real time, meaning prices can be adjusted as demand for rides changes and as traffic conditions make journeys likely to take longer. This encourages more drivers to get behind the wheel when they are needed – and to stay at home when demand is low.

    The company have applied for a patent on this method of Big Data-informed pricing, which they call “surge pricing”. This is an implementation of “dynamic pricing” – similar to that used by hotel chains and airlines to adjust price to meet demand – although rather than simply increasing prices at weekends or during public holidays it uses predictive modelling to estimate demand in real time.
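
    The general shape of such a demand-based pricing rule can be sketched as follows (an illustration only; Uber's actual surge algorithm is not public in this form, so the formula, cap and numbers are invented):

def surge_multiplier(ride_requests: int, available_drivers: int, base_fare: float) -> float:
    """Raise the fare when demand outstrips supply; never drop below the base fare."""
    if available_drivers == 0:
        return round(base_fare * 3.0, 2)             # hard cap when there is no supply at all
    demand_ratio = ride_requests / available_drivers
    multiplier = min(3.0, max(1.0, demand_ratio))    # clamp between 1.0x and a 3.0x cap
    return round(base_fare * multiplier, 2)

print(surge_multiplier(ride_requests=30, available_drivers=40, base_fare=8.50))   # quiet period: base fare
print(surge_multiplier(ride_requests=200, available_drivers=40, base_fare=8.50))  # peak demand: capped surge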

    Data also drives (pardon the pun) the company’s UberPool service. According to Uber’s blog, introducing this service became a no-brainer when their data told them the “vast majority of [Uber trips in New York] have a look-a-like trip – a trip that starts near, ends near and is happening around the same time as another trip”. 

    Other initiatives either trialed or due to launch in the future include UberChopper, offering helicopter rides to the wealthy, Uber-Fresh for grocery deliveries and Uber Rush, a package courier service.

    These are just two companies using Big Data to generate a very real advantage and disrupt their markets in incredible ways. I’ve compiled dozens more examples of Big Data in practice in my new book of the same name, in the hope that it will inspire and motivate more companies to similarly innovate and take their fields into the future. 

    Thank you for reading my post. Here at LinkedIn and at Forbes I regularly write about management, technology and Big Data. If you would like to read my future posts then please click 'Follow' and feel free to also connect via Twitter, Facebook, Slideshare, and The Advanced Performance Institute.

    You might also be interested in my new and free ebook on Big Data in Practice, which includes 3 Amazing use cases from NASA, Dominos Pizza and the NFL. You can download the ebook from here: Big Data in Practice eBook.

    Author: Bernard Marr

    Source: Linkedin Blog

  • An overview of Morgan Stanley's surge toward data quality


    Jeff McMillan, chief analytics and data officer at Morgan Stanley, has long worried about the risks of relying solely on data. If the data put into an institution's system is inaccurate or out of date, it will give customers the wrong advice. At a firm like Morgan Stanley, that just isn't an option.

    As a result, Morgan Stanley has been overhauling its approach to data. Chief among its goals is improving data quality in core business processing.

    “The acceleration of data volume and the opportunity this data presents for efficiency and product innovation is expanding dramatically,” said Gerard Hester, head of the bank’s data center of excellence. “We want to be sure we are ahead of the game.”

    The data center of excellence was established in 2018. Hester describes it as a hub with spokes out to all parts of the organization, including equities, fixed income, research, banking, investment management, wealth management, legal, compliance, risk, finance and operations. Each division has its own data requirements.

    “Being able to pull all this data together across the firm we think will help Morgan Stanley’s franchise internally as well as the product we can offer to our clients,” Hester said.

    The firm hopes that improved data quality will let the bank build higher quality artificial intelligence and machine learning tools to deliver insights and guide business decisions. One product expected to benefit from this is the 'next best action' the bank developed for its financial advisers.

    This next best action uses machine learning and predictive analytics to analyze research reports and market data, identify investment possibilities, and match them to individual clients’ preferences. Financial advisers can choose to use the next best action’s suggestions or not.

    Another tool that could benefit from better data is an internal virtual assistant called 'ask research'. Ask research provides quick answers to routine questions like, “What’s Google’s earnings per share?” or “Send me your latest model for Google.” This technology is currently being tested in several departments, including wealth management.

    New data strategy

    Better data quality is just one of the goals of the revamp. Another is to have tighter control and oversight over where and how data is being used, and to ensure the right data is being used to deliver new products to clients.

    To make this happen, the bank recently created a new data strategy with three pillars. The first is working with each business area to understand its data issues and begin to address them.

    “We have made significant progress in the last nine months working with a number of our businesses, specifically our equities business,” Hester said.

    The second pillar is tools and innovation that improve data access and security. The third pillar is an identity framework.

    At the end of February, the bank hired Liezel McCord to oversee data policy within the new strategy. Until recently, McCord was an external consultant helping Morgan Stanley with its Brexit strategy. One of McCord’s responsibilities will be to improve data ownership, to hold data owners accountable when the data they create is wrong and to give them credit when it’s right.

    “It’s incredibly important that we have clear ownership of the data,” Hester said. “Imagine you’re joining lots of pieces of data. If the quality isn’t high for one of those sources of data, that could undermine the work you’re trying to do.”

    Data owners will be held accountable for the accuracy, security and quality of the data they contribute and make sure that any issues are addressed.

    Trend of data quality projects

    Arindam Choudhury, the banking and capital markets leader at Capgemini, said many banks are refocusing on data as it gets distributed in new applications.

    Some are driven by regulatory concerns, he said. For example, the Basel Committee on Banking Supervision's standard number 239 (principles for effective risk data aggregation and risk reporting) is pushing some institutions to make data management changes.

    “In the first go-round, people complied with it, but as point-to-point interfaces and applications, which was not very cost effective,” Choudhury said. “So now people are looking at moving to the cloud or a data lake, they’re looking at a more rationalized way and a more cost-effective way of implementing those principles.”

    Another trend pushing banks to get their data house in order is competition from fintechs.

    “One challenge that almost every financial services organization has today is they’re being disintermediated by a lot of the fintechs, so they’re looking at assets that can be used to either partner with these fintechs or protect or even grow their business,” Choudhury said. “So they’re taking a closer look at the data access they have. Organizations are starting to look at data as a strategic asset and try to find ways to monetize it.”

    A third driver is the desire for better analytics and reports.

    "There’s a strong trend toward centralizing and figuring out, where does this data come from, what is the provenance of this data, who touched it, what kinds of rules did we apply to it?” Choudhury said. That, he said, could lead to explainable, valid and trustworthy AI.

    Author: Penny Crosman

    Source: Information-management

  • Big Data Analytics: hype?


    Hardly a day goes by without a news item or discussion about data in the media. Whether it is about privacy issues, the new opportunities and threats of big data, or new services based on cleverly combining and exchanging data: there is no getting around the fact that information is 'hot'.

    Is Big Data Analytics, that is, the analysis of large volumes of mostly unstructured data, a hype? When the term suddenly popped up everywhere a few years ago, many sceptics said it was a trick by software vendors to remarket something that already existed; after all, data analysis has been applied for a long time. By now, the experts agree that Big Data Analytics, in the form in which it can be applied today, will have an enormous impact on the world as we know it. Yes, it is a hype, but a justified one.

    Big Data Analytics: what exactly is it?

    Big data has been a hype for years and will remain one for a while yet. When exactly does data become 'big'; at how many tera-, peta- or yottabytes (10^24 bytes) does the boundary between 'normal' and 'big' data lie? The answer is that there is no clear boundary: you speak of big data when it becomes too much for your own people and resources. Big Data Analytics focuses on exploring data with statistical methods to gain new insights that can be used to improve future performance.

    Big Data Analytics is already in full use at companies as a way to steer performance. Think of a sports club that uses it to decide which players to buy, a bank that stopped recruiting talent exclusively from top universities because candidates from less prestigious universities turned out to perform better, or an insurance company that uses it to detect fraud. And so on.

    What makes Big Data Analytics possible?

    At least three developments are taking Big Data Analytics into a whole new phase.

    1. Computing power

    The increasing computing power of computers enables analysts to work with enormous datasets and to include a large number of variables in their analyses. Thanks to that increased computing power it is no longer necessary to take a sample, as in the past; all the data can be used for an analysis. The analysis is done with specific tools and often requires specific knowledge and skills on the part of the user, a data analyst or data scientist.

    2. Data creation

    The internet and social media are making the amount of data we create grow exponentially. This data can be used for countless data analysis applications, most of which have yet to be invented.

    To get a sense of this data growth, consider these statistics:

    - More than a billion tweets are sent every 48 hours.

    - A million Twitter accounts are added every day.

    - Every 60 seconds, 293,000 status updates are posted on Facebook.

    - The average Facebook user creates 90 pieces of content per month, including links, news stories, photos and videos.

    - Every minute, 500 Facebook accounts are added.

    - Every day, 350 million photos are uploaded to Facebook, which comes down to 4,000 photos per second.

    - If Wikipedia were a book, it would comprise more than two billion pages.

    Source: http://www.iacpsocialmedia.org

    3. Data storage

    The cost of storing data has fallen sharply in recent years, which has greatly expanded the possibilities for applying analytics. One example is the storage of video footage. Security cameras in a supermarket used to record everything on tape; if nothing had happened after three days, the tape was rewound and recorded over.

    That is no longer necessary. A supermarket can now send digital footage covering the entire store to the cloud, where it remains stored. It then becomes possible to apply analytics to that footage: which promotions work well? Which shelves do people linger in front of? What are the blind spots in the store? Or predictive analytics: if we were to put this product on that shelf, what would the result be? Management can use these analyses to arrive at an optimal store layout and get the maximum return out of promotions.

    What Big Data Analytics means

    Big data, or 'smart data' as Bernard Marr, author of the practical new book 'Big Data: Using SMART Big Data Analytics To Make Better Decisions and Improve Performance', prefers to call it, is changing the world. The amount of data is currently growing exponentially, but for most decision-makers the sheer quantity is largely irrelevant. What matters is how you use it to arrive at valuable insights.

    Big Data 

    Opinions differ on what big data actually is. Gartner defines big data in terms of the three V's: Volume, Velocity and Variety. It is about the amount of data, the speed at which the data can be processed, and the diversity of the data. The latter means that, in addition to structured sources, data can also be drawn from all kinds of unstructured sources, such as the internet and social media, including text, speech and images.

    Analytics

    Who would not want to predict the future? With enough data, the right technology and a dose of mathematics, that comes within reach. This is called business analytics, although many other terms are in circulation, such as data science, machine learning and, indeed, big data. Even though the underlying mathematics has existed for quite some time, it is still a relatively young field that until recently was only within reach of specialized companies with deep pockets.

    Yet we all use it already without realizing it. Speech recognition on your phone, virus scanners on your PC and spam filters for email are based on concepts that fall within the domain of business analytics. The development of self-driving cars and all the steps toward them (adaptive cruise control, lane departure systems, et cetera) is also only possible thanks to machine learning.

    In short, analytics is the discovery and communication of meaningful patterns in data. Companies can apply analytics to business data to describe, predict and improve their business performance. There are various kinds of analytics, such as text analytics, speech analytics and video analytics.

    An example of text analytics is a law firm that uses it to search thousands of documents in order to quickly find the information needed to prepare a new case. Speech analytics is used in call centers, for example, to determine the caller's mood so that the agent can anticipate it as well as possible. Video analytics can be used to monitor security cameras: unusual patterns are picked out automatically so that security staff can step in. They no longer have to stare at a screen for hours while nothing happens.

    The process can be approached both top-down and bottom-up. The most commonly used approaches are:

    • Data mining: examining data on the basis of a targeted question, looking for a specific answer.
    • Trend analysis and predictive analytics: deliberately looking for cause-and-effect relationships in order to explain certain events or to predict future behavior (see the sketch after this list).
    • Data discovery: examining data for unexpected relationships or other striking findings.
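
    As a minimal sketch of the predictive-analytics approach named above (invented data and feature names, purely illustrative): learn from historical cases with a known outcome and estimate that outcome for a new case.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented features per customer: [months as customer, support calls, monthly spend]
X = np.array([[24, 1, 40], [3, 5, 20], [36, 0, 55], [6, 4, 25], [12, 3, 30]])
y = np.array([0, 1, 0, 1, 1])   # 1 = customer churned, 0 = customer stayed

model = LogisticRegression().fit(X, y)

new_customer = np.array([[8, 4, 22]])
print(f"Churn probability: {model.predict_proba(new_customer)[0, 1]:.2f}")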

    Facts and dimensions

    The data that help you gain insights or make decisions are facts, for example EBITDA, revenue or the number of customers. These facts acquire meaning through dimensions: the revenue for the year 2014 for the baby food product line in the East region. By analyzing along dimensions you can discover relationships, identify trends and make predictions about the future.
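
    In code, the same idea looks roughly like this (invented numbers): the fact is revenue, and the dimensions year, product line and region determine how it is sliced.

import pandas as pd

sales = pd.DataFrame({
    "year":    [2013, 2013, 2014, 2014, 2014],
    "product": ["baby food", "snacks", "baby food", "baby food", "snacks"],
    "region":  ["East", "East", "East", "West", "East"],
    "revenue": [120_000, 80_000, 150_000, 90_000, 85_000],
})

# The fact (revenue) summed over the chosen dimensions:
print(sales.groupby(["year", "product", "region"])["revenue"].sum())

# One specific slice: 2014 revenue for baby food in the East region.
mask = (sales["year"] == 2014) & (sales["product"] == "baby food") & (sales["region"] == "East")
print(sales.loc[mask, "revenue"].sum())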

    Analytics versus Business Intelligence

    How does analytics differ from business intelligence (BI)? In essence, analytics is data-based support for decision-making. BI shows what has happened, based on historical data presented in predefined reports. Where BI provides insight into the past, analytics focuses on the future. Analytics tells you what may happen by running 'what if' scenarios on the constantly changing stream of data, making estimates and predicting risks and trends.

    Examples of Big Data Analytics

    The world keeps getting smarter. Everything is measurable, from our heart rate during a run to walking patterns in stores. By using that data, we can build impressive analyses to, for example, prevent traffic jams, nip epidemics in the bud and offer personalized medication.

    This evolution is visible even in the most traditional industries, such as fishing. Instead of relying, as of old, purely on a compass and 'insider knowledge' passed down through generations of fishing families, today's fisherman attaches sensors to fish and tracks down schools with the most advanced GPS systems. Big Data Analytics is now applied in every industry and sector, and cities use it too. Below is an overview of possible applications:

    Understanding your target audience better

    The American mega-retailer Target can tell from a combination of 25 purchases when a woman is pregnant. That is one of the few periods in a person's life in which buying behavior deviates from routine, and Target cleverly responds to it with baby-related offers. Amazon has become so good at predictive analytics that it can start shipping products toward you before you have bought them. If it is up to them, you will soon be able to have your order delivered by drone within 30 minutes.

    Improving processes

    Processes are also changing because of big data. Take purchasing: Walmart knows that more Pop-Tarts are sold when a storm warning is issued. They do not know why, but they do make sure they have enough stock and give the snacks a prominent spot in the store. Another process where data offers great opportunities for optimization is the supply chain. Which routes do you have your drivers take, and in which order do they deliver the orders? Real-time weather and traffic data is used to adjust along the way.

    Business optimization

    At Q-Park, customers pay per minute for parking, but it is also possible to take out a subscription, at a much lower price per minute. When the garage starts to fill up, it is unfortunate if a subscription customer happens to drive in, because that costs revenue. The analytics system therefore periodically calculates the optimal mix of subscription and non-subscription spots on the basis of historical data. That way the garage operator gets the maximum out of what there is to be had.

    Optimizing machines

    General Electric (GE) is an enthusiastic user of big data. The conglomerate already uses a lot of data in its data-intensive sectors, such as healthcare and financial services, but it also sees industrial applications, for instance in GE's locomotive, jet engine and gas turbine businesses. GE characterizes the equipment in industries like these as 'things that spin' and expects that most, if not all, of those things will soon be able to record and communicate data about that spinning.

    Een van die draaiende dingen is de gasturbine die de klanten van GE gebruiken voor energieopwekking. GE monitort nu al meer dan 1500 turbines vanuit een centrale faciliteit, dus een groot deel van de infrastructuur voor gebruik van big data om de prestaties te verbeteren is er al. GE schat dat het de efficiëntie van de gemonitorde turbines met minstens 1 procent kan verbeteren via software en netwerkoptimalisatie, doeltreffender afhandelen van onderhoud en betere harmonisering van het gas-energiesysteem. dat lijkt misschien niet veel, maar het zou neerkomen op een brandstofbesparing van 66 miljard dollar in de komende 15 jaar.
    (bron: 'Big Data aan het werk' door Thomas Davenport)

    Customer service and commerce

    A major gain from the new big data capabilities is that companies can connect everything: silos, systems, products, customers, and so on. In telecoms, for example, the cost-to-serve concept has been introduced. It lets an operator look, from the actual operation, at the touchpoints it has with a customer: how often he calls customer service, what his payment behaviour is, how he uses his subscription, how he became a customer, how long he has been a customer, where he lives and works, which phone he uses, et cetera.

    When the telecom company brings the data from all those angles together, a completely different view of that customer’s costs and revenue suddenly emerges. That multitude of perspectives holds opportunities. Simply by integrating data and viewing it in context, surprising new insights are bound to appear. What companies typically look at today is the top 10 customers who contribute the most and the least to revenue, and then they draw a line between the two. That is a very limited use of the available data. By sketching the context, the company may be able to devise actions that encourage the bottom 10 to do a little more, or decide to part with them after all, but then as a deliberate choice.
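    The article does not describe a concrete implementation, so the sketch below is purely illustrative: it joins invented billing, support and usage extracts into one customer-level view and ranks customers by margin, which is the kind of integration step the cost-to-serve idea relies on.

        import pandas as pd

        # Hypothetical extracts from three separate systems (all column names and values invented)
        billing = pd.DataFrame({"customer_id": [1, 2, 3],
                                "monthly_revenue": [45.0, 20.0, 80.0]})
        support = pd.DataFrame({"customer_id": [1, 1, 2, 3],
                                "call_minutes": [12, 8, 30, 2]})
        usage = pd.DataFrame({"customer_id": [1, 2, 3],
                              "data_gb": [5.2, 18.0, 2.1]})

        COST_PER_SUPPORT_MINUTE = 0.9   # assumed handling cost

        # Integrate the silos into a single customer-level view
        view = (billing
                .merge(support.groupby("customer_id", as_index=False)["call_minutes"].sum(),
                       on="customer_id", how="left")
                .merge(usage, on="customer_id", how="left")
                .fillna({"call_minutes": 0}))

        view["cost_to_serve"] = view["call_minutes"] * COST_PER_SUPPORT_MINUTE
        view["margin"] = view["monthly_revenue"] - view["cost_to_serve"]
        print(view.sort_values("margin"))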

    Smart cities

    New York City now uses a ‘soundscape’ of the entire city. A disturbance in the typical urban sound, such as a gunshot, is immediately passed on to the police, who can respond. Criminals are in for a difficult century thanks to this kind of Big Data Analytics.

    Smart hospitals

    Whether it concerns the information collected during a patient’s admission or information from general annual reports: Big Data is becoming ever more important for hospitals, enabling better patient care, better scientific research and better business information. The volume of medical data doubles every five years. This data can be of great value in delivering the right care.

    HR Analytics

    Data can be used to monitor and assess the performance of employees. This applies not only to a company’s workforce but will also increasingly be used to assess the top layer of managers and leaders objectively.

    One company that has reaped the benefits of HR Analytics is Google. The internet and tech giant never quite believed that managers had much impact, so the analytics team set to work on the question: ‘Do managers actually have a positive impact at Google?’ Their analysis showed that managers do make a difference and can have a positive impact at Google. The next question was: ‘What makes a great manager at Google?’ This resulted in 8 behaviours of the best managers and the 3 biggest pitfalls. It led to a highly effective training and feedback programme for managers that has had a very positive influence on Google’s performance.

    Big Data Analytics in SMEs

    A common misconception about Big Data is that it is only for large companies. Wrong: every business, large or small, can put data to work. In his book, Bernard Marr gives an example of a small fashion retailer he worked with.

    The business in question wanted to increase its sales, but apart from traditional sales data it had nothing to work with towards that goal. So it first drew up a number of questions:

    - How many people pass our shops?

    - How many stop to look in the window, and for how long?

    - How many then come inside?

    - How many go on to buy something?

    They then placed a small, discreet device behind the window that began counting the number of passing mobile phones (and thus people). The device also records how many people stop in front of the window and for how long, and how many come inside. Sales data then captures how many people buy something. The retail chain could subsequently experiment with different window displays to test which were most successful. The project led to considerably more revenue, and to the closure of one struggling branch that turned out to have too little footfall.
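    As a small illustration of what that measurement chain yields, the funnel can be compared per window display in a few lines of Python (all counts invented):

        # Invented counts from the phone-counting sensor plus the sales system
        funnel_per_display = {
            "window A": {"passers_by": 4200, "stopped": 510, "entered": 140, "bought": 38},
            "window B": {"passers_by": 4050, "stopped": 820, "entered": 260, "bought": 71},
        }

        for display, f in funnel_per_display.items():
            stop_rate = f["stopped"] / f["passers_by"]
            entry_rate = f["entered"] / f["stopped"]
            conversion = f["bought"] / f["entered"]
            print(f"{display}: {stop_rate:.1%} stop, {entry_rate:.1%} enter, {conversion:.1%} buy")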

    Conclusion

    The Big Data revolution is rapidly making the world smarter. The challenge for companies is that this revolution takes place alongside business as usual. There is still much to be done before most organisations are able to truly profit from Big Data Analytics. Most organisations are happy enough if they can report and analyse properly. Many companies still have to start experimenting, which may mean overcoming their cold feet. What is certain is that a great many opportunities will now arise in quick succession. The race that has now begun will show who runs off with the new insights.

    Author: Jeppe Kleyngeld

    Source: FMI

                

  • Big data and the future of the self-driving car


    Each year, car manufacturers get closer to successfully developing a fully autonomous vehicle. Over the last several years, major tech companies have paired up with car manufacturers to develop the advanced technology that will one day allow the majority of vehicles on the road to be autonomous. Of the five levels of automation, companies like Ford and Tesla are hovering around level three, which offers several autonomous driving functions but still requires a person to be attentive behind the wheel.

    However, car manufacturers are expected to release fully autonomous vehicles to the public within the next decade. These vehicles are expected to bring substantial safety and environmental benefits. Self-driving technology has come a long way over the last few years, as the growth of big data in technology industries has helped provide car manufacturers with the programming data needed to get closer to fully automating cars. Big data is helping to give autonomous cars enough information and deep-learning capability to make them safer for all drivers.

    History of self-driving cars

    The first major automation in cars was cruise control, which was patented in 1950 and is used by most drivers to keep their speed steady during long drives nowadays. Most modern cars already have several automated functions, like proximity warnings and steering adjustment, which have been tried and tested, and proven to be valuable features for safe driving. These technologies use sensors to alert the driver when they are coming too close to something that may be out of the driver’s view or something that the driver may simply not have noticed.

    The fewer functions drivers have to worry about and pay attention to, the more they’re able to focus on the road in front of them and stay alert to dangerous circumstances that could occur at any moment. Human error causes 90 percent of all crashes on the roads, which is one of the main reasons so many industries support the development of autonomous vehicles. However, even when a driver is completely attentive, circumstances that are out of their control could cause them to go off the road or crash into other vehicles. Car manufacturers are still working on the programming for autonomous driving in weather that is less than ideal.

    Big data’s role in autonomous vehicle development

    Although these technologies provided small steps toward automation, they remained milestones away from a fully automated vehicle. However, over the last decade, with the large range of advancements that have been made in technology and the newfound use of big data, tech companies have discovered the necessary programming for fully automating vehicles. Autonomous vehicles rely entirely on the data they receive through GPS, radar and sensor technology, and the information they process through cameras.

    The information cars receive through these sources provides them with the data needed to make safe driving decisions. Although car manufacturers are still using stores of big data to work out the kinks of the thousands of scenarios an autonomous car could find itself in, it’s only a matter of time before self-driving cars transform the automotive industry by making up the majority of cars on the road. As the price of the advanced radars for these vehicles goes down, self-driving cars should become more accessible to the public, which will increase the safety of roads around the world.

    Big data is changing industries worldwide, and deep learning is contributing to the progress towards fully autonomous vehicles. Although it will still be several decades before the mass adoption of self-driving cars, the change will slowly but surely come. In only a few decades, we’ll likely be living in a time where cars are a safer form of transportation, and accidents are tragedies that are few and far between.

    Source: Insidebigdata

  • Big data can’t bring objectivity to a subjective world

    It seems everyone is interested in big data these days. From social scientists to advertisers, professionals from all walks of life are singing the praises of 21st-century data science.
     
    In the social sciences, many scholars apparently believe it will lend their subject a previously elusive objectivity and clarity. Sociology books like An End to the Crisis of Empirical Sociology? and work from bestselling authors are now talking about the superiority of “Dataism” over other ways of understanding humanity. Professionals are stumbling over themselves to line up and proclaim that big data analytics will enable people to finally see themselves clearly through their own fog.
     
    However, when it comes to the social sciences, big data is a false idol. In contrast to its use in the hard sciences, the application of big data to the social, political and economic realms won’t make these areas much clearer or more certain.
     
    Yes, it might allow for the processing of a greater volume of raw information, but it will do little or nothing to alter the inherent subjectivity of the concepts used to divide this information into objects and relations. That’s because these concepts — be they the idea of a “war” or even that of an “adult” — are essentially constructs, contrivances liable to change their definitions with every change to the societies and groups who propagate them.
     
    This might not be news to those already familiar with the social sciences, yet there are nonetheless some people who seem to believe that the simple injection of big data into these “sciences” should somehow make them less subjective, if not objective. This was made plain by a recent article published in the September 30 issue of Science.
     
    Authored by researchers from the likes of Virginia Tech and Harvard, “Growing pains for global monitoring of societal events” showed just how far off the mark the assumption is that big data will bring exactitude to the large-scale study of civilization.
     
    More precisely, it reported on the workings of four systems used to build supposedly comprehensive databases of significant events: Lockheed Martin’s International Crisis Early Warning System (ICEWS), Georgetown University’s Global Data on Events Language and Tone (GDELT), the University of Illinois’ Social, Political, and Economic Event Database (SPEED) and the Gold Standard Report (GSR) maintained by the not-for-profit MITRE Corporation.
     
    Its authors tested the “reliability” of these systems by measuring the extent to which they registered the same protests in Latin America. If they or anyone else were hoping for a high degree of duplication, they were sorely disappointed, because they found that the records of ICEWS and SPEED, for example, overlapped on only 10.3 percent of these protests. Similarly, GDELT and ICEWS hardly ever agreed on the same events, suggesting that, far from offering a complete and authoritative representation of the world, these systems are as partial and fallible as the humans who designed them.
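    The paper’s actual matching procedure is more involved, but the reliability check it describes boils down to measuring how far two sets of recorded events agree, roughly as in this toy sketch (event keys invented):

        # Toy version of the agreement check between two event databases
        icews_events = {("2013-06-20", "Brazil", "Sao Paulo"),
                        ("2013-06-21", "Brazil", "Rio de Janeiro"),
                        ("2013-07-01", "Chile", "Santiago")}
        speed_events = {("2013-06-20", "Brazil", "Sao Paulo"),
                        ("2013-08-15", "Argentina", "Buenos Aires")}

        shared = icews_events & speed_events
        agreement = len(shared) / len(icews_events | speed_events)
        print(f"The two systems agree on {agreement:.1%} of their recorded protests")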
     
    Even more discouraging was the paper’s examination of the “validity” of the four systems. For this test, its authors simply checked whether the reported protests actually occurred. Here, they discovered that 79 percent of GDELT’s recorded events had never happened, and that ICEWS had gone so far as entering the same protests more than once. In both cases, the respective systems had essentially identified occurrences that had never, in fact, occurred.
     
    They had mined troves and troves of news articles with the aim of creating a definitive record of what had happened in Latin America protest-wise, but in the process they’d attributed the concept “protest” to things that — as far as the researchers could tell — weren’t protests.
     
    For the most part, the researchers in question put this unreliability and inaccuracy down to how “Automated systems can misclassify words.” They concluded that the examined systems had an inability to notice when a word they associated with protests was being used in a secondary sense unrelated to political demonstrations. As such, they classified as protests events in which someone “protested” to her neighbor about an overgrown hedge, or in which someone “demonstrated” the latest gadget. They operated according to a set of rules that were much too rigid, and as a result they failed to make the kinds of distinctions we take for granted.
     
    As plausible as this explanation is, it misses the more fundamental reason as to why the systems failed on both the reliability and validity fronts. That is, it misses the fact that definitions of what constitutes a “protest” or any other social event are necessarily fluid and vague. They change from person to person and from society to society. Hence, the systems failed so abjectly to agree on the same protests, since their parameters on what is or isn’t a political demonstration were set differently from each other by their operators.
     
    Make no mistake, the basic reason as to why they were set differently from each other was not because there were various technical flaws in their coding, but because people often differ on social categories. To take a blunt example, what may be the systematic genocide of Armenians for some can be unsystematic wartime killings for others. This is why no amount of fine-tuning would ever make such databases as GDELT and ICEWS significantly less fallible, at least not without going to the extreme step of enforcing a single worldview on the people who engineer them.
     
    Much the same could be said for the systems’ shortcomings in the validity department. While the paper’s authors stated that the fabrication of nonexistent protests was the result of the misclassification of words, and that what’s needed is “more reliable event data,” the deeper issue is the inevitable variation in how people classify these words themselves.
     
    It’s because of this variation that, even if big data researchers make their systems better able to recognize subtleties of meaning, these systems will still produce results with which other researchers find issue. Once again, this is because a system might perform a very good job of classifying newspaper stories according to how one group of people might classify them, but not according to how another would classify them.
     
    In other words, the systematic recording of masses of data alone won’t be enough to ensure the reproducibility and objectivity of social studies, because these studies need to use often controversial social concepts to make their data significant. They use them to organize “raw” data into objects, categories and events, and in doing so they infect even the most “reliable event data” with their partiality and subjectivity.
     
    What’s more, the implications of this weakness extend far beyond the social sciences. There are some, for instance, who think that big data will “revolutionize” advertising and marketing, allowing these two interlinked fields to reach their “ultimate goal: targeting personalized ads to the right person at the right time.” According to figures in the advertising industry “[t]here is a spectacular change occurring,” as masses of data enable firms to profile people and know who they are, down to the smallest preference.
     
    Yet even if big data might enable advertisers to collect more info on any given customer, this won’t remove the need for such info to be interpreted by models, concepts and theories on what people want and why they want it. And because these things are still necessary, and because they’re ultimately informed by the societies and interests out of which they emerge, they maintain the scope for error and disagreement.
     
    Advertisers aren’t the only ones who’ll see certain things (e.g. people, demographics, tastes) that aren’t seen by their peers.
     
    If you ask the likes of Professor Sandy Pentland from MIT, big data will be applied to everything social, and as such will “end up reinventing what it means to have a human society.” Because it provides “information about people’s behavior instead of information about their beliefs,” it will allow us to “really understand the systems that make our technological society” and allow us to “make our future social systems stable and safe.”
     
    That’s a fairly grandiose ambition, yet the possibility of these realizations will be undermined by the inescapable need to conceptualize information about behavior using the very beliefs Pentland hopes to remove from the equation. When it comes to determining what kinds of objects and events his collected data are meant to represent, there will always be the need for us to employ our subjective, biased and partial social constructs.
     
    Consequently, it’s unlikely that big data will bring about a fundamental change to the study of people and society. It will admittedly improve the relative reliability of sociological, political and economic models, yet since these models rest on socially and politically interested theories, this improvement will be a matter of degree rather than kind. The potential for divergence between separate models won’t be erased, and so, no matter how accurate one model becomes relative to the preconceptions that birthed it, there will always remain the likelihood that it will clash with others.
     
    So there’s little chance of a big data revolution in the humanities, only the continued evolution of the field.
  • Big data defeats dengue

    Numbers have always intrigued Wilson Chua, a big data analyst hailing from Dagupan, Pangasinan and currently residing in Singapore. An accountant by training, he crunches numbers for a living, practically eats them for breakfast, and scans through rows and rows of excel files like a madman.
     
    About 30 years ago, just when computer science was beginning to take off, Wilson stumbled upon the idea of big data. And then he swiftly fell in love. He came across the story of John Snow, the English physician who solved the cholera outbreak in London in 1854, which fascinated him with the idea even further. “You can say he’s one of the first to use data analysis to come out with insight,” he says.
     
    In 1850s-London, everybody thought cholera was airborne. Nobody had any inkling, not one entertained the possibility that the sickness was spread through water. “And so what John Snow did was, he went door to door and made a survey. He plotted the survey scores and out came a cluster that centered around Broad Street in the Soho District of London.
     
    “In the middle of Broad Street was a water pump. Some of you already know the story, but to summarize it even further, he took the lever of the water pump so nobody could extract water from that anymore. The next day,” he pauses for effect, “no cholera.”
     
    The story had stuck with him ever since, but never did he think he could do something similar. For Wilson, it was just amazing how making sense of numbers saved lives.
     
    A litany of data
     
    In 2015 the province of Pangasinan, from where Wilson hails, struggled with rising cases of dengue fever. There were enough dengue infections in the province—2,940 cases were reported in the first nine months of 2015 alone—for it to be considered an epidemic, had Pangasinan chosen to declare it.
     
    Wilson sat comfortably away in Singapore while all this was happening. But when two of his employees caught the bug—he had business interests in Dagupan—the dengue outbreak suddenly became a personal concern. It became his problem to solve.
     
    “I don’t know if Pangasinan had the highest number of dengue cases in the Philippines,” he says, “but it was my home province, so my interests lay there.” He learned from the initial data released by the government that Dagupan had the highest incidence in all of Pangasinan. Wilson, remembering John Snow, wanted to dig deeper.
     
    Using his credentials as a technology writer for Manila Bulletin, he wrote to the Philippine Integrated Disease Surveillance and Response (PIDSR) team of the Department of Health, requesting three years’ worth of data on Pangasinan.
     
    The DOH acquiesced and sent him back a litany of data on an Excel sheet: 81,000 rows of numbers or around 27,000 rows of data per year. It’s an intimidating number but one “that can fit in a hard disk,” Wilson says.
     
    He then set out to work. Using tools that converted massive data into understandable patterns—graphs, charts, the like—he looked for two things: When dengue infections spiked and where those spikes happened.
     
    “We first determined that dengue was highly related to the rainy season. It struck Pangasinan between August and November,” Wilson narrates. “And then we drilled down the data to uncover the locations, which specific barangays were hardest hit.”
     
    The Bonuan district of the city of Dagupan, which covers the barangays of Bonuan Gueset, Bonuan Boquig, and Bonuan Binloc, accounted for a whopping 29.55 percent, nearly a third of all the cases in Dagupan for the year 2015.
     
    The charts showed that among the 30 barangays, Bonuan Gueset was number 1 in all three years. “It means to me that Bonuan Gueset was the ground zero, the focus of infection.”
     
    But here’s the cool thing: after running the data through analytics, Wilson learned that the PIDSR had sent more than he had hoped for. They also included the age of those affected. According to the data, dengue in Bonuan was prevalent among school children aged 5-15 years old.
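    The original analysis was done with off-the-shelf analytics tools rather than code, but a minimal pandas sketch of this kind of drill-down, using invented records and column names, would look something like this:

        import pandas as pd

        # Invented sample; the real PIDSR extract had around 81,000 rows
        cases = pd.DataFrame({
            "onset_date": ["2015-08-03", "2015-09-14", "2015-09-20", "2015-10-02"],
            "barangay":   ["Bonuan Gueset", "Bonuan Boquig", "Bonuan Gueset", "Bonuan Binloc"],
            "age":        [9, 12, 7, 34],
        })
        cases["month"] = pd.to_datetime(cases["onset_date"]).dt.month

        # When do infections spike?
        print(cases.groupby("month").size())

        # Where are they concentrated, and how old are the patients there?
        summary = cases.groupby("barangay").agg(total_cases=("age", "size"),
                                                median_age=("age", "median"))
        print(summary.sort_values("total_cases", ascending=False))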
     
    “Now, given the behavior of Aedes aegypti, the dengue-carrying mosquito (they bite after sunrise and a few hours before sunset), it’s easy to surmise that the kids were bitten while in school.”
     
    It excited him so much he fired up Google Maps and switched it to satellite image. Starting with Barangay Bonuan Boquig, he looked for places that had schools that had stagnant pools of water nearby. “Lo and behold, we found it,” he says.
     
    Sitting smack in the middle of Lomboy Elementary School and Bonuan Boquig National High School were large pools of stagnant water.
    Like hitting the jackpot, Wilson quickly posted his findings on Facebook, hoping someone would take up the information and make something out of it. Two people hit him up immediately: Professor Nicanor Melecio, the project director of the e-Smart Operation Center of the Dagupan City Government, and Wesley Rosario, director at the Bureau of Fisheries and Aquatic Resources, a fellow Dagupeño.
     
    A social network
     
    Unbeknownst to Wilson, back in Dagupan, the good professor had been busy conducting studies of his own. The e-Smart Center, tasked with handling crises, flooding and similar disaster situations, had been looking into the district’s topography vis-a-vis rainfall in the Bonuan district. “We wanted to detect the catch basins of the rainfall,” he says, “the elevation of the area, the landscape. Basically, we wanted to know the deeper areas where rainfall could possibly stagnate.”
     
    Like teenage boys, the two excitedly messaged each other on Facebook. “Professor Nick had lidar maps of Dagupan, and when he showed me those, it confirmed that these areas, where we see the stagnant water during rainfall, are the very areas that accumulate rainfall without exit points,” Wilson says. With no sewage system, the water just sat there and accumulated.
     
    With Wilson still operating remotely in Singapore, Professor Melecio took it upon himself to do the necessary fieldwork. He went to the sites, scooped up water from the stagnant pools, and confirmed they were infested with kiti-kiti or wriggling mosquito larvae.
     
    Professor Melecio quickly coordinated with Bonuan Boquig Barangay Captain Joseph Maramba to involve the local government of Bonuan Boquig on their plan to conduct vector control measures.
     
    A one-two punch
     
    Back in Singapore, Wilson found inspiration in the Lion City’s solution to its own mosquito problem. “They used mosquito dunks that contain BTI, a bacterium that kills mosquito larvae,” he says.
     
    He used his own money to buy a few of those dunks, imported them to Dagupan, and on Oct. 6 had his team scatter them around the stagnant pools of Bonuan Boquig. The solution was great, dream-like even, except it had a shelf life: beyond 30 days, the bacteria are no longer effective.
     
    Before he even had a chance to worry about the solution’s sustainability, BFAR director Wesley Rosario pinged him on Facebook saying the department had 500 mosquito fish for disposal. Would Wilson want to send somebody to his office, get the fish, and release them into the pools?
     
    The Gambusia earned its “mosquito fish” nickname because it eats, among other things, mosquito larvae. In Wilson’s and Wesley’s minds, the mosquito fish could easily make a home of the stagnant pools and feast on the very many larvae present. When the dry season comes, the fish will be left to die. Except, here’s the catch: the mosquito fish is edible.
     
    “The mosquito fish solution was met with a few detractors,” Wilson admits. “There are those who say every time you introduce a new species, it might become invasive. But it’s not really new, as it is already endemic to the Philippines. Besides, we are releasing them in a landlocked area, so nothing else will be affected.”
     
    The critics, however, were silenced quickly. Four days after deploying the fish, the mosquito larvae were either eaten or dead. Twenty days into the experiment, with the one-two punch of the dunks and the fish, Barangay Boquig reported no new infections of dengue.
     
    “You know, we were really only expecting the infections to drop 50 percent,” Wilson says, rather pleased. More than 30 days into the study and Barangay Bonuan Boquig still has no reports of new cases. “We’re floored,” he added.
     
    At the moment, nearby barangays are already replicating what Wilson, Professor Melecio, and Wesley Rosario have done with Bonuan Boquig. Michelle Lioanag of the non-profit Inner Wheel Club of Dagupan has already taken up the cause to do the same for Bonuan Gueset, the ground zero for dengue in Dagupan.
     
    According to Wilson, what they did in Bonuan Boquig is just a proof of concept, a cheap demonstration of what big data can do. “It was so easy to do,” he said. “Everything went smoothly,” he added, saying all it needed was cooperative and open-minded community leaders with nothing more than sincere public service on their agenda.
     
    “You know, big data is multi-domain and multi-functional. We can use it for a lot of industries, like traffic for example. I was talking with the country manager of Waze…” he fires off rapidly, excited at what else his big data can solve next.
     
    Source: news.mb.com, November 21, 2016
  • Big Data Experiment Tests Central Banking Assumptions

    (Bloomberg) -- Central bankers may do well to pay less attention to the bond market and their own forecasts than they do to newspaper articles. That’s the somewhat heretical finding of a new algorithm-based index being tested at Norway’s central bank in Oslo. Researchers fed 26 years of news (or 459,745 news articles) from local business daily Dagens Naringsliv into a macroeconomic model to create a “newsy coincident index of business cycles” to help it gauge the state of the economy.

    Leif-Anders Thorsrud, a senior researcher at the bank who started the project while getting his Ph.D. at the Norwegian Business School, says the “hypothesis is quite simple: the more that is written on a subject at a time, the more important the subject could be.”

    He’s already working on a new paper (yet to be published) showing it’s possible to make trades on the information. According to Thorsrud, the work is part of a broader “big data revolution.”

    Big data and algorithms have become buzzwords for hedge funds and researchers looking for an analytical edge when reading economic and political trends. For central bankers, the research could provide precious input to help them steer policy through an unprecedented era of monetary stimulus, with history potentially serving as a poor guide in predicting outcomes.

    At Norway’s central bank, researchers have found a close correlation between news and economic developments. Their index also gives a day-to-day picture of how the economy is performing, and does so earlier than lagging macroeconomic data.

    But even more importantly, big data can be used to predict where the economy is heading, beating the central bank’s own forecasts by about 10 percent, according to Thorsrud. The index also showed it was a better predictor of the recession in the early 2000s than market indicators such as stocks or bonds.

    The central bank has hired machines, which pore daily through articles from Dagens Naringsliv and divide current affairs into topics and into words with either positive or negative connotations. The data is then fed into a macroeconomic model employed by the central bank, which spits out a proxy of GDP.
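    The bank’s actual model is not spelled out in the article; the sketch below only mimics the recipe it describes, tagging each article with a topic, scoring its words as positive or negative, and aggregating the result into a day-by-day series (topics, word lists and articles are all invented).

        from collections import defaultdict

        TOPICS = {"oil": {"oil", "energy"}, "retail": {"retail", "shop"}, "trade": {"exports", "imports"}}
        POSITIVE = {"growth", "record", "hiring", "profit", "steady"}
        NEGATIVE = {"layoffs", "losses", "bankruptcy", "decline"}

        articles = [
            {"date": "2016-11-01", "text": "oil sector reports record profit and new hiring"},
            {"date": "2016-11-01", "text": "retail chain faces losses and layoffs"},
            {"date": "2016-11-02", "text": "exports show steady growth"},
        ]

        daily_index = defaultdict(float)
        for article in articles:
            words = set(article["text"].lower().split())
            matched_topics = [name for name, keywords in TOPICS.items() if words & keywords]
            tone = len(words & POSITIVE) - len(words & NEGATIVE)
            # The more topics an article touches, the more weight its tone gets
            daily_index[article["date"]] += tone * max(len(matched_topics), 1)

        # This day-by-day series is the kind of input that could be fed into a macro model
        for day, score in sorted(daily_index.items()):
            print(day, score)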

    Thorsrud says the results of the index are definitely “policy relevant,” though it’s up to the operative policy makers whether they will start using the information. Other central banks, such as the Bank of England, are looking at similar tools, he said.

    While still in an experimental stage, the bank has set aside more resources to continue the research, Thorsrud said. “In time this could be useful in the operative part of the bank.”

    Source: Informatie Management
  • Big Data will improve our healthcare

    He is a man with a mission, and not a small one: together with patients, care providers and insurers he wants to bring about a shift in healthcare, moving the focus from managing illness to managing health. Jeroen Tas, CEO Philips Connected Care & Health Informatics, on the future of healthcare.

    What is wrong with the current system?

    “In the developed world, an average of 80 percent of the healthcare budget is spent on treating chronic diseases, such as cardiovascular disease, lung disease, diabetes and various forms of cancer. Only 3 percent of that budget is spent on prevention, on keeping people from getting those diseases, even though we know that 80 percent of cardiovascular disease, 90 percent of type 2 diabetes and 50 percent of cancers are preventable. Socio-economic factors play a role, but so do diet, smoking and drinking, how much exercise you get each day and whether you take your medication properly. So with the current system we are often not steering on the right drivers to improve people’s health and thereby their lives. Fifty percent of patients do not take their medication at all, or not on time. That is where the opportunities for improvement lie.”

    That system has existed for years; why is it a problem now?
    “The reasons are, I think, widely known. In many countries, including the Netherlands, the population is ageing, so the number of chronically ill people is growing and with it the pressure on healthcare. At the same time, citizens’ attitudes towards care are changing: better accessibility, integration and 24/7 availability are the big wishes. Finally, the technological possibilities are increasing rapidly. People can and want to play a more active role in their own health: self-measurement, personal information and feedback on progress. With Big Data we are now, for the first time, able to analyse large volumes of data quickly, discover patterns in them and learn more about predicting and preventing disease. In short, we live in a time in which a great deal can and will change within a short period. That makes it important to steer the right course.”

    What do you think needs to change?
    “Healthcare is still organised around (acute) events. Health, however, is a continuous process and starts with healthy living and prevention. If people do fall ill, diagnosis and treatment follow. Then they recover, but may still need support at home. And you hope they go back to living healthily. If their condition deteriorates, timely intervention is desirable. The focus of our current system is almost entirely on diagnosis and treatment. The reimbursement system is geared to that as well: a radiologist is not judged on his contribution to a patient’s treatment but on the number of images he produces and assesses. Yet we know that a great deal can be gained in terms of time, well-being and money if we focus more on healthy living and prevention.

    There also need to be far more connections between the different pillars of the system, and feedback on the effectiveness of diagnosis and treatment. That can be done, for example, by encouraging the sharing of information. If a cardiologist has more data about a patient’s home situation, for instance about how he takes his medication, eats and exercises, he can draw up a much better treatment plan, tailored to the patient’s specific situation. If the home care service also has access to that patient’s data after treatment, they know what to pay extra attention to for optimal recovery. And last but certainly not least, the patient must have that data too, in order to stay as healthy as possible. That is how you get a patient-centred system aimed at optimal health.”

    That sounds very logical. Why isn’t it happening yet?
    “All change is difficult, certainly in a sector like healthcare, which is conservative for understandable reasons and involves complex processes. It is not a matter of technology: all the technology we need to bring about this shift exists. We have sensors that generate data automatically and that can be installed in the patient’s environment, worn on the body (think of a smartwatch) or even placed inside the body, in the case of smart medicines. That puts the human being at the centre of the system, and that is where we want to go.
    A care network needs to form around each person, within which data is shared for the benefit of that person’s health. Thanks to technology, many treatments can also be delivered remotely, via eHealth solutions. That is usually faster and above all more efficient than sending people to the hospital by default. Think of home monitoring, a portable ultrasound device at the GP’s practice, or video calls with a care provider. Incidentally, we can already measure heart rate, respiration and SpO2 from a video image.

    The technology is there. We only need to combine it, integrate it and, above all, implement it. Implementation depends on the willingness of everyone involved to find the right reimbursement model and form of collaboration: government, health insurers, hospitals, doctors, care providers and the patient. On that point I am actually optimistic: I see attitudes changing, slowly but surely. There is a growing willingness to change.”

    Is that willingness the only limiting factor?
    “We also need to sort out a number of things around data. Data must be able to flow without barriers, so that all of a patient’s information is available anytime, anywhere. That obviously also means making sure the data is well secured, and that we can keep guaranteeing that security. Finally, we have to build the trust needed to standardise and share data, among care providers and above all among patients. That sounds heavy and complicated, but we have done it before. If someone had told you twenty years ago that you would handle all your banking over the internet, you would have thought they were mad: far too unsafe. By now we hardly do it any other way.
    The shift in healthcare now, like the shift in finance back then, requires a different mindset. The urgency is there, the technology is there, and the willingness is growing, which is why I am very positive about the future of healthcare.”

    Source: NRC
  • Business Data Scientist 2.0

    More than three years ago we ran the first Business Data Scientist programme. Triggered by the many sexy job adverts, we as lecturers asked ourselves what exactly makes a data scientist a data scientist. What struck us in those adverts, besides an enormous variety, was the laundry list of required competencies; the association with the proverbial five-legged sheep, the candidate who can do everything at once, was quickly made. Above all, the 2014 adverts radiated hope and ambition: companies with high expectations looking for skilled people to refine the ever-growing stream of data into value for the business. What does that actually involve?

    A few years and seven programmes later, much has changed. And yet, in a way, little has. Companies’ expectations are still sky-high. The data scientist comes in all shapes and sizes, and that seems to be accepted. But the core question, how to turn data into value and what that requires, remains underexposed. The relevance of a Business Data Scientist programme is therefore unchanged, and has in fact grown. Many companies have made their investments in data science; now it is time to harvest.

    Data scientist 2.0

    Turning data into value requires ‘connection’: connection between, on one side, the hard-core data scientists who can drill for data like oil, refine it into information and deliver it to specification, and, on the other side, the business people with their challenges. In our programmes we have heard many stories of fine data projects that turned out to be pearls before swine because that connection was missing. However important the technical work is, without that connection the data scientist does not survive. Should every data scientist take this programme? Is there a job called business data scientist? Both questions can be answered with a resounding no. But if you want to operate at the interface of application and data science, this programme is exactly right for you, and that interface will increasingly take centre stage in data-intensive organisations.

    The business data scientist is someone who knows better than anyone that the value of data lies in its eventual use. From that simple starting point he or she defines, guides and steers data projects in organisations, thinks along about structurally embedding data science in the organisation’s operational and policy processes, and proposes ways to set this up. The business data scientist knows the data science toolbox through and through without necessarily being able to use every instrument in it. He or she does know which piece of technology should be deployed for which type of problem and, conversely, is able to characterise and classify business problems so that the right technologies and expertise can be selected. The business data scientist understands information processes, knows the data science toolbox and knows how to move deftly among the interests that always surround projects.

    The BDS programme is relevant for product managers and marketers who want to work more data-intensively, for hard-core data scientists who want to connect with the application side of their organisation, and for (project) managers who are responsible for how data scientists function.

    The BDS 2.0 programme is characterised by an action-oriented way of learning. Cases take centre stage, built on a theoretical framework that looks at the data science toolbox from the perspective of business value. The cases cover every phase of turning data into value: from project definition, through data analysis and business analytics, to actual use. For each relevant phase, specialists provide a deep dive. Interested in the programme? Download the brochure here: http://www.ru.nl/rma/leergangen/bds/

    Egbert Philips  

    Lecturer, BDS programme, Radboud Management Academy

    Director, Hammer market intelligence, www.Hammer-intel.com

     

  • Business Intelligence Trends for 2017

    Analyst and consulting firm Business Application Research Centre (BARC) has come out with the top BI trends, based on a survey of 2,800 BI professionals. Compared to last year, there were no significant changes in the ranking of the importance of BI trends, indicating that no major market shifts or disruptions are expected to impact this sector.
     
    With the growing advancement and disruptions in IT, the eight meta trends that influence and affect the strategies, investments and operations of enterprises worldwide are Digitalization, Consumerization, Agility, Security, Analytics, Cloud, Mobile and Artificial Intelligence. All these meta trends are major drivers of the growing demand for data management, business intelligence and analytics (BI), and their growth also shapes the trends in this industry. The top three of the 21 trends for 2017 were:
    • Data discovery and visualization,
    • Self-service BI and
    • Data quality and master data management.

    Data labs and data science, cloud BI and data as a product were the least important trends for 2017.
    Data discovery and visualization, along with predictive analytics, are some of the most desired BI functions that users want in a self-service mode. But the report suggested that organizations should also have an underlying tool and data governance framework to ensure control over data.
     
    In 2016, BI was mainly used in the finance department, followed by management and sales, and there was only slight variation in their usage rates over the last three years. But there was a surge in BI usage in production and operations departments, which grew from 20% in 2008 to 53% in 2016.
     
    "While BI has always been strong in sales and finance, production and operations departments have traditionally been more cautious about adopting it,” says Carsten Bange, CEO of BARC. “But with the general trend for using data to support decision-making, this has all changed. Technology for areas such as event processing and real-time data integration and visualization has become more widely available in recent years. Also, the wave of big data from the Internet of Things and the Industrial Internet has increased awareness and demand for analytics, and will likely continue to drive further BI usage in production and operations."
     
    Customer analysis was the #1 investment area for new BI projects, with 40% of respondents investing their BI budgets in customer behavior analysis and 32% in developing a unified view of customers.
    • “With areas such as accounting and finance more or less under control, companies are moving to other areas of the enterprise, in particular to gain a better understanding of customer, market and competitive dynamics,” said Carsten Bange.
    • Many BI trends of the past have become critical BI components in the present.
    • Many organizations were also considering trends like collaboration and sensor data analysis as critical BI components. About 20% of respondents were already using BI trends like collaboration and spatial/location analysis.
    • About 12% were using cloud BI and more were planning to employ it in the future. IBM's Watson and Salesforce's Einstein are gearing up to meet this growth.
    • Only 10% of the respondents used social media analysis.
    • Sensor data analysis is also growing, driven by the huge volumes of data generated by the millions of IoT devices used by the telecom, utilities and transportation industries. According to the survey, the transport and telecom industries will lead the leveraging of sensor data in 2017.
    The biggest new investments in BI are planned in the manufacturing and utilities industries in 2017.
     
    Source: readitquick.com, November 14, 2016
  • Creating a single view of the customer with the help of data

    Data has a key contribution to make in creating a single view of the customer, which can be used to improve your business and better understand those involved in the market you serve. In order to thrive in the current economic environment, businesses need to know their customers very well so they can provide exceptional customer service. To do so, they must be able to rapidly understand and react to customer shopping behaviors.

    To properly interpret and react to customer behaviors, businesses need a complete, single view of their customers. What does that mean? A Single View of the Customer (SVC) allows a business to analyze and visualize all the relevant information surrounding its customers, such as transactions, products, demographics, marketing, discounting, etc.

    Unfortunately, the IT systems typically found in mid- to large-scale companies have separated much of the relevant customer data into individual systems. Marketing information, transaction data, website analytics, customer profiles, shipping information, product information, etc. are often kept in different data repositories. This makes the implementation of SVC potentially challenging.

    First, let’s examine two scenarios where a company’s ability to fuse all these sources into an SVC can provide tremendous value. Afterwards, we will turn to strategies for implementing an SVC.

    Call center

    Customer satisfaction has a major impact on the bottom line of any business. Studies have found that 78 percent of customers have failed to complete a transaction because of poor customer service. A company’s call center is a key part of maintaining customer satisfaction. Customers interacting with the call center typically do so because there is already a problem – a missing order, a damaged or incorrect product. It is critical that the call center employee can resolve the problem as quickly as possible. Unfortunately, due to the typical IT infrastructure discussed above, the customer service representative often faces a significant challenge. For example, in order to locate the customer’s order, access shipping information and potentially find a replacement product to ship, the customer service representative may have to log in to three or more different systems (often, the number is significantly higher). Every login to a new system increases the time the customer has to wait, decreasing their satisfaction, and every additional system adds to the probability of disturbances or even failure.

    Sales performance

    Naturally, in order to maximize revenue, it is critical to understand numerous key metrics including, but not limited to:

    • What products are/are not doing well?
    • What products are being purchased by your key customer demographic groups?
    • What items are being purchased together (basket analysis)?
    • Are there stores with inventory problems (overstocked/understocked)?

    Once again, the plethora of data storage systems poses a challenge. To gather the data required to perform the necessary analytics, numerous systems have to be queried. As with the call center scenario, this is often performed manually, via “swivel-chair integration.” This means that an analyst has to manually log in to each system, execute a query to get the necessary data, store that data in a temporary location (often in Microsoft Excel™), and then repeat that process for each independent data store. Only once the data is gathered can the actual analysis begin. Gathering the necessary data often takes longer than the analysis itself. Even in medium-sized companies this process can involve numerous people to get the analysis done as quickly as possible. Still, the manual nature of this process means that it is not only expensive to perform (in terms of resources), but it occurs at a much slower pace than is ideal for making business decisions rapidly.

    The fact that performing even the most basic and critical analytics is so expensive and time consuming often prevents companies from taking the next steps, even though those steps could turn out to be critical to the business' sales. One of those potential next steps is moving the full picture of the customer directly into the stores. When sales associates can immediately access customer information, they are able to provide a more personalized customer experience, which is likely to increase customer satisfaction and average revenue per sale.

    Another opportunity where the company can see tremendous sales impact is in moving from reactive analytics to predictive analytics. When a company runs traditional retail metrics – as previously described – they are typically either done as part of a regular reporting cycle or in response to an event, such as a sale, in order to understand the impact of that event on the business. While no one is likely to dispute the value of those analytics, the fact is that the company is merely reacting to events that have already happened. We can try to use advanced analytic methods to predict how our customers may behave in the future based on their past actions, but as we so often hear from financial analysts, past performance is not indicative of future results. However, if we can take our SVC, which links together all of their past actions, and tie in information about what the customer intends to do in the future (like Prosper Insight Analytics'), we now have a roadmap of customer intent that we can use to make key business decisions.
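    Take the basket-analysis question listed earlier as an example. Once the transaction data is reachable in one place, the analysis itself is small; a minimal sketch with invented baskets:

        from collections import Counter
        from itertools import combinations

        # Invented transactions, one set of product codes per basket
        baskets = [
            {"jeans", "belt"},
            {"jeans", "t-shirt", "belt"},
            {"t-shirt", "socks"},
            {"jeans", "belt", "socks"},
        ]

        pair_counts = Counter()
        for basket in baskets:
            pair_counts.update(combinations(sorted(basket), 2))

        # Which items are most often purchased together?
        for pair, count in pair_counts.most_common(3):
            print(pair, count)

    The hard part in practice is not these few lines but getting the transaction, product and customer data out of their separate systems into one queryable place, which is exactly what the data fabric discussed below is for.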

    Creating and implementing a Single View of the Customer

    To be effective, an implementation of SVC must be driven from a unified data fabric. Attempting to create an SVC by directly connecting the application to each of the necessary data sources will be extremely time consuming and will prove highly challenging to implement. A data fabric can connect the necessary data to provide an operational analytic environment upon which to base the SVC. The data fabric driving the SVC should meet the following requirements:

    • It must connect all the relevant data from the existing data sources in a central data store. This provides the ability to perform the complex analytics across all the linked information.
    • It needs to be easily modified to support new data. As the business grows and evolves, learning to leverage its new SVC, additional data sources will typically be identified as being helpful. These should be easy to integrate into the system.
    • The initial implementation must be rapid. Ideally, a no-code solution should be implemented. Companies rarely have the resources to support expensive, multi-month IT efforts.
    • It should not disrupt existing systems. The data fabric should provide an augmented data layer that supports the complex analytic queries required by the SVC without disrupting the day-to-day functionality of the existing data stores or applications.

    Conclusion

    A well-built SVC can have a significant, positive impact on a company’s bottom line. Customer satisfaction can be increased and analysis can move from being reactive to being predictive of customer behavior. This effort will likely require that a data fabric be developed to support the application, but new technologies now make it possible to rapidly create that fabric using no-code solutions, thereby making feasible the deployment of a Single View of the Customer.

    Author: Clark Richey

    Source: Smart Data Collective

  • Dashboard storytelling: The perfect presentation (part 1)


    Plato famously said that “those who tell stories rule society.” This statement is as true today as it was in ancient Greece, perhaps even more so in modern times.

    In the contemporary world of business, the age-old art of storytelling is far from forgotten: rather than speeches on the Senate floor, businesses rely on striking data visualizations to convey information, drive engagement, and persuade audiences.

    By combining the art of storytelling with the technological capabilities of dashboard software, it’s possible to develop powerful, meaningful, data-backed presentations that not only move people but also inspire them to take action or make informed, data-driven decisions that will benefit your business.

    As far back as anyone can remember, narratives have helped us make sense of the sometimes complicated world around us. Rather than just listing facts, figures, and statistics, people used gripping, imaginative timelines, bestowing raw data with real context and interpretation. In turn, this got the attention of listeners, immersing them in the narrative, thereby offering a platform to absorb a series of events in their mind’s eye precisely the way they unfolded.

    Here we explore data-driven, live dashboard storytelling in depth, looking at storytelling with KPIs and the dynamics of a data storytelling presentation while offering real-world storytelling presentation examples.

    First, we’ll delve into the power of data storytelling as well as the general dynamics of a storytelling dashboard and what you can do with your data to deliver a great story to your audience. Moreover, we will offer dashboard storytelling tips and tricks that will help you make your data-driven narrative-building efforts as potent as possible, driving your business into exciting new dimensions. But let’s start with a simple definition.

    “You’re never going to kill storytelling, because it’s built in the human plan. We come with it.” – Margaret Atwood

    What is dashboard storytelling?

    Dashboard storytelling is the process of presenting data in effective visualizations that depict the whole narrative of key performance indicators, business strategies and processes in the form of an interactive dashboard on a single screen, and in real-time. Storytelling is indeed a powerful force, and in the age of information, it’s possible to use the wealth of insights available at your fingertips to communicate your message in a way that is more powerful than you could ever have imagined. So, let's take a look at the top tips and tricks to be able to successfully create your own story with a few clicks.

    4 Tricks to get started with dashboard storytelling

    Big data commands big stories.

    Forward-thinking business people turn to online data analysis and data visualizations to display colossal volumes of content in a few well-designed charts. But these condensed business insights may remain hidden if they aren’t communicated with words in a way that is effective and rewarding to follow. Without language, business people often fail to push their message through to their audience, and as such, fail to make any real impact.

    Marketers, salespeople, and entrepreneurs are today’s storytellers. They are wholly responsible for their data story. People in these roles are often the bridge between their data and the forum of decision-makers they’re looking to encourage to take the desired action.

    Effective dashboard storytelling with data in a business context must be focused on tailoring the timeline to the audience and choosing the right data visualization type to complement or even enhance the narrative.

    To demonstrate this notion, let’s look at some practical tips on how to prepare the best story to accompany your data.

    1. Start with data visualization

    This may sound repetitive, but when it comes to a dashboard presentation, or dashboard storytelling presentation, it will form the foundation of your success: you must choose your visualization carefully.

    Different views answer different questions, so it’s vital to take care when choosing how to visualize your story. To help you in this regard, you will need a robust data visualization tool. These intuitive aids in dashboard storytelling are now ubiquitous and provide a wide array of options to choose from, including line charts, bar charts, maps, scatter plots, spider webs, and many more. Such interactive tools are rightly recognized as a more comprehensive option than PowerPoint presentations or endless Excel files.

    These tools help both in exploring the data and visualizing it, enabling you to communicate key insights in a persuasive fashion that results in buy-in from your audience.
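
    To make the point about different views answering different questions concrete, here is a minimal, hypothetical Python sketch (not part of the original article) that plots the same monthly revenue figures twice: as a line chart to show a trend and as a bar chart to compare individual months. The numbers and labels are invented for illustration.

    # Illustrative only: the data and chart choices are assumptions, not
    # taken from the article or from any specific dashboard tool.
    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
    revenue = [120, 135, 128, 150, 162, 171]   # revenue in thousands (invented)

    fig, (ax_trend, ax_compare) = plt.subplots(1, 2, figsize=(10, 4))

    # A line chart answers "how does revenue develop over time?"
    ax_trend.plot(months, revenue, marker="o")
    ax_trend.set_title("Trend: revenue over time")

    # A bar chart answers "how do the individual months compare?"
    ax_compare.bar(months, revenue)
    ax_compare.set_title("Comparison: revenue per month")

    plt.tight_layout()
    plt.show()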

    But for optimum effectiveness, we still need more than a computer algorithm. Here we need a human to present the data in a way that makes it meaningful and valuable. Moreover, this person doesn’t need to be a conventional presenter or a teacher-like figure. According to research carried out by Stanford University, there are two types of storytelling: author-driven and reader-driven storytelling.

    An author-driven narrative is static and authoritative because it dictates the analysis process to the reader or listener. It’s like analyzing a chart printed in a newspaper. On the other hand, reader-driven storytelling allows the audience to structure the analysis on their own. Here, the audience can choose the data visualizations that they deem meaningful and interact with them on their own by drilling down to more details or choosing from the various KPI examples they want to see visualized. They can reach out for insights that are crucial to them and make sense of the data independently. A different story may need a different type of storytelling.

    2. Put your audience first

    Storytelling for a dashboard presentation should always begin with stating your purpose. What is the main takeaway from your data story? It should be clear that your purpose is to motivate the audience to take a certain action.

    Instead of thinking about your business goals, try to envision what your listeners are seeking. Each member of your audience, be that a potential customer, future business partner, or stakeholder, has come to listen to your data storytelling presentation to gain something for themselves. To better meet your audience’s expectations and gain their trust (and money), put their goals first when determining the line of your story.

    Needless to say, before your dashboard presentation, try to learn as much as you can about your listeners. Put yourself in their shoes: Who are they? What do they do on a daily basis? What are their needs? What value can they draw from your data for themselves?

    The better you understand your audience, the more they will trust you and follow your idea.

    3. Don’t fill up your data storytelling with empty words

    Storytelling with data, rather than just presenting data visualizations, brings the best results. That said, there are certain enemies of your story that make it more complicated than enlightening and turn your efforts into a waste of time.

    The first things that could cause trouble are the various technology buzzwords that are devoid of any defined meaning. These words don’t create a clear picture in your listeners’ heads and are useless as a storytelling aid. In addition to under-informing your audience, buzzwords signal lazy thinking and suggest that you don’t have anything unique or meaningful to say. Try to add clarity to your story by using more precise and descriptive language that truly communicates your purpose.

    Another trap is using your own industry jargon to sound more professional. The problem is that if it isn’t the jargon of your listeners’ industry, they may not comprehend your narrative. Moreover, some jargon phrases have different meanings depending on the context in which they are used: they mean one thing in the business field and something else in everyday life. Generally, jargon reduces clarity and can even convey the opposite of what you intend to communicate in your data storytelling.

    Don’t make your story too long; focus on explaining the meaning of the data rather than on the ornateness of your language or the humor of your anecdotes. Avoid overusing buzzwords or industry jargon and try to figure out what insights your listeners want to draw from the data you show them.

    4. Utilize the power of storytelling

    Before we continue our journey into data-powered storytelling, we’d like to further illustrate the unrivaled power of offering your audience, staff, or partners inspiring narratives by sharing these must-know insights:

    • Recent studies suggest that 80% of today’s consumers want brands to tell stories about their business or products.
    • The average person processes around 100,500 digital words every day. By taking your data and transforming it into a focused, value-driven narrative, you stand a far better chance of your message resonating with your audience and yielding the results you desire.
    • Human beings absorb information 60 times faster with visuals than with linear text-based content alone. By harnessing the power of data visualization to form a narrative, you’re likely to earn an exponentially greater level of success from your internal or external presentations.

    Please also take a look at part 2 of this interesting read, including presentation tips and examples of dashboard storytelling.

    Author: Sandra Durcevic

    Source: Datapine

  • Dashboard storytelling: The perfect presentation (part 2)

    Dashboard storytelling: The perfect presentation (part 2)

    In the first part of this article, we introduced the phenomenon of dashboard storytelling and some tips and tricks to get started with it. If you haven’t read part 1 of this article yet, make sure you do. You can find part 1 here.

    How to present a dashboard – 6 Tips for the perfect dashboard storytelling presentation

    Now that we’ve covered the data-driven storytelling essentials, it’s time to dig deeper into ways that you can make maximum impact with your storytelling dashboard presentations.

    Business dashboards are now driving forces for visualization in the field of business intelligence. Unlike their predecessors, a state-of-the-art dashboard builder gives presenters the ability to engage audiences with real-time data and offers a more dynamic approach to presenting data than the rigid, linear nature of, say, PowerPoint.

    With the extra creative freedom data dashboards offer, the art of storytelling is making a reemergence in the boardroom. The question now is: What determines great dashboarding?

    Without further ado, here are six tips that will help you to transform your presentation into a story and rule your own company through dashboard storytelling.

    1. Set up your plan

    Start at square one on how to present a dashboard: outline your presentation. Like all good stories, the plot should be clear, problems should be presented, and an outcome foreshadowed. You have to ask yourself the right data analysis questions when it comes to exploring the data to get insights, but you also need to ask yourself the right questions when it comes to presenting such data to a certain audience. Which information do they need to know or want to see? Make sure you have a concise storyboard when you present so you can take the audience along with you as you show off your data. Try to be purpose-driven to get the best dashboarding outcomes, but don’t entangle yourself in a rigid format that is unchangeable.

    2. Don’t be afraid to show some emotion

    Stephen Few, a leading design consultant, explains on his blog that “when we appeal to people’s emotions strictly to help them personally connect with information and care about it, and do so in a way that draws them into reasoned consideration of the information, not just feeling, we create a path to a brighter, saner future”. Emotions stick around much longer in a person’s psyche than facts and charts. Even the most analytical thinkers out there will be more likely to remember your presentation if you can weave in elements of human life and emotion. How to present a dashboard with emotion? By adding some anecdotes, personal life experiences that everyone can relate to, or culturally shared moments and jokes.

    However, do not rely just on emotions to make your point. Your conclusions and ideas need to be backed by data, science, and facts. Otherwise, and especially in business contexts, you might not be taken seriously. You’d also miss an opportunity to help people learn to make better decisions by using reason and would only tap into a “lesser-evolved” part of humanity. Instead, emotionally appeal to your audience to drive home your point.

    3. Make your story accessible to people outside your sector

    Combining complicated jargon, millions of data points, and advanced math concepts into a story that people can understand is not an easy task. Opt for simplicity and clear visualizations to increase the level of audience engagement.

    Your entire audience should be able to understand the points that you are driving home. Jeff Bladt, the director of Data Products Analytics at DoSomething.org, offered a pioneering case study on accessibility through data. When commenting on how he goes from 350 million data points to organizational change, he shared: “By presenting the data visually, the entire staff was able to quickly grasp and contribute to the conversation. Everyone was able to see areas of high and low engagement. That led to a big insight: Someone outside the analytics team noticed that members in Texas border towns were much more engaged than members in Northwest coastal cities.”

    Making your presentation accessible to laypeople opens up more opportunities for your findings to be put to good use.

    4. Create an interactive dialogue

    No one likes being told what to do. Instead of preaching to your audience, enable them to be a part of the presentation through interactive dashboard features. By using real-time data, manipulating data points in front of the audience, and encouraging questions during the presentation, you will ensure your audiences are more engaged as you empower them to explore the data on their own. At the same time, you will also provide a deeper context. The interactivity is especially interesting in dashboarding when you have a broad target audience: it onboards newcomers easily while letting the ‘experts’ dig deeper into the data for more insights.

    5. Experiment

    Don’t be afraid to experiment with different approaches to storytelling with data. Create a dashboard storytelling plan that allows you to experiment and test different options, learn what builds engagement among your listeners, and make sure you fortify your data storytelling with KPIs (key performance indicators). Even when you fail and your audience falls asleep or checks their email, you will learn from it and gather information on how to improve your dashboarding and data storytelling techniques, presentation after presentation.

    6. Balance your words and visuals wisely

    Last but certainly not least is a tip that encompasses all of the above advice and also offers a means of keeping it consistent, accessible, and impactful from start to finish: balance your words and visuals wisely.

    What we mean here is that in data-driven storytelling, consistency is key if you want to grip your audience and drive your message home. Our eyes and brains focus on what stands out. The best data storytellers leverage this principle by building charts and graphs with a single message that can be effortlessly understood, highlighting both visually and with words the strings of information that they want their audience to remember the most.

    With this in mind, you should keep your language clear, concise, and simple from start to finish. While doing this, use the best possible visualizations to enhance each segment of your story, placing a real emphasis on any graph, chart, or sentence that you want your audience to take away with them.

    Every single element of your dashboard design is essential, but by emphasizing the areas that really count, you’ll make your narrative all the more memorable, giving yourself the best possible chance of enjoying the results you deserve.

    The best dashboard storytelling examples

    Now that we’ve explored the ways in which you can improve your data-centric storytelling and make the most of your presentations, it’s time for some inspiring storytelling presentation examples. Let’s start with a storytelling dashboard that relates to the retail sector.

    1. A retailer’s store dashboard with KPIs

    The retail industry is an interesting one, as it has been particularly disrupted by the advent of online retailing. Collecting and analyzing data is extremely important for this sector, which can take excellent advantage of analytics because of its data-driven nature. As such, data storytelling with KPIs is a particularly effective method to communicate trends, discoveries, and results.

    The first of our storytelling presentation examples serves up information related to customers’ behavior and helps in identifying patterns in the data collected. The specific retail KPIs tracked here focus on sales: by division, by item, by city, and on out-of-stock items. It lets us know what the current trends in customers’ purchasing habits are and allows us to break this data down by city or by gender and age for enhanced analysis. We can also anticipate any stock-outs to avoid losing money and visualize stock-out tendencies over time to spot problems in the supply chain.
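
    As a rough illustration of the retail KPIs described above, the following Python sketch computes sales by division and by city plus an out-of-stock list from two small, invented tables; all column names and figures are assumptions rather than anything taken from the dashboard product itself.

    # Hypothetical transaction and stock tables; the schema is an assumption.
    import pandas as pd

    sales = pd.DataFrame({
        "division": ["Women", "Men", "Kids", "Women", "Men"],
        "city":     ["Berlin", "Berlin", "Munich", "Munich", "Hamburg"],
        "item":     ["Coat", "Boots", "Jacket", "Scarf", "Coat"],
        "revenue":  [18000, 9600, 3600, 5000, 9000],
    })
    stock = pd.DataFrame({
        "item":    ["Coat", "Boots", "Jacket", "Scarf"],
        "on_hand": [0, 35, 12, 0],
    })

    # Sales KPIs broken down by division and by city.
    sales_by_division = sales.groupby("division")["revenue"].sum()
    sales_by_city = sales.groupby("city")["revenue"].sum()

    # Out-of-stock items that would be flagged on the dashboard.
    out_of_stock = stock.loc[stock["on_hand"] == 0, "item"].tolist()

    print(sales_by_division)
    print(sales_by_city)
    print("Out of stock:", out_of_stock)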

    2. A hospital’s management dashboard with KPIs

    This second of our data storytelling examples tells the tale of a busy working hospital. That might sound a little fancier than it is, but it’s of paramount importance, all the more so when it comes to public healthcare: a sector relatively new to data collection and analytics that has a lot to gain from it in many ways.

    For a hospital, a centralized dashboard is a great ally in the everyday management of the facility. The one we have here gives us the big picture of a complex establishment, tracking several healthcare KPIs.

    From total admissions to total patients treated and the average waiting time in the ER, overall or broken down per division, the story told by the healthcare dashboard is essential. The top management of this facility has a holistic view to run operations more easily and efficiently, and can implement measures when figures look abnormal. For instance, an average waiting time for a certain division that is much higher than the others can shed light on problems this division might be facing: lack of staff training, lack of equipment, an understaffed unit, etc.
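
    The waiting-time check described above can be illustrated with a small, hypothetical Python sketch that computes the average wait per division and flags divisions well above the hospital-wide mean; the data and the 25% threshold are invented for illustration only.

    # Invented visit records; division names and the threshold are assumptions.
    import pandas as pd

    visits = pd.DataFrame({
        "division":     ["ER", "Surgery", "Cardiology", "ER", "Surgery", "Cardiology"],
        "wait_minutes": [35, 20, 55, 45, 25, 80],
    })

    per_division = visits.groupby("division")["wait_minutes"].mean()
    overall = visits["wait_minutes"].mean()

    # Flag divisions whose average wait exceeds the overall mean by more than 25%.
    flagged = per_division[per_division > overall * 1.25]

    print(per_division)
    print("Needs attention:", flagged.index.tolist())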

    All this is vital for the patient’s satisfaction as well as the safety and wellness of the hospital staff that deals with life and death every day.

    3. A human resources (HR) recruitment dashboard with KPIs

    The third of our data storytelling examples relates to human resources. This particular storytelling dashboard focuses on one of the most essential responsibilities of any modern HR department: the recruitment of new talent.

    In today’s world, digital natives are looking to work with a company that not only shares their beliefs and values but offers opportunities to learn, progress, and grow as an individual. Finding the right fit for your organization is essential if you want to improve internal engagement and reduce employee turnover.

    The HR KPIs related to this storytelling dashboard are designed to enhance every aspect of the recruitment journey, helping to drive economic efficiency and improve the quality of hires significantly.

    Here, the art of storytelling with KPIs is made easy. This HR dashboard offers a clear snapshot of important aspects of HR recruitment, including the cost per hire, recruiting conversion or success rates, and the time to fill a vacancy from initial contact to official offer.
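
    To show the arithmetic behind these recruitment KPIs, here is a minimal, hypothetical Python sketch computing cost per hire, average time to fill, and a simple conversion rate from invented records; the field names and figures are assumptions.

    # Invented hiring records used only to illustrate the KPI formulas.
    from datetime import date

    hires = [
        {"opened": date(2019, 1, 7), "offer_accepted": date(2019, 2, 20), "cost": 4200, "applicants": 58},
        {"opened": date(2019, 2, 1), "offer_accepted": date(2019, 3, 15), "cost": 3800, "applicants": 41},
    ]

    cost_per_hire = sum(h["cost"] for h in hires) / len(hires)
    time_to_fill = sum((h["offer_accepted"] - h["opened"]).days for h in hires) / len(hires)
    conversion_rate = len(hires) / sum(h["applicants"] for h in hires)

    print(f"Cost per hire:   {cost_per_hire:.0f}")
    print(f"Time to fill:    {time_to_fill:.0f} days")
    print(f"Conversion rate: {conversion_rate:.1%}")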

    With this most intuitive of data storytelling examples, building a valuable narrative that resonates with your audience is made easy, and as such, it’s possible to share your recruitment insights in a way that fosters real change and business growth.

    Final words of advice

    One of the major advantages of working with dashboards is the improvement they have made to data visualization. Don’t let this feature go to waste with your own presentations. Place emphasis on making visuals clear and appealing to get the most from your dashboarding efforts.

    Transform your presentations from static, lifeless work products into compelling stories by weaving an interesting and interactive plot line into them.

    If you haven't read part 1 of this article yet, you can find it here.

    Author: Sandra Durcevic

    Source: Datapine

  • Data governance: using factual data to form subjective judgments

    Data governance: using factual data to form subjective judgments

    Data warehouses were born of the finance and regulatory age. When you peel away the buzzwords, the principal goal of this initial phase of business intelligence was the certification of truth. Warehouses helped to close the books and analyze results. Regulations like Dodd-Frank wanted to make sure that you took special care to certify the accuracy of financial results, Basel wanted certainty around capital liquidity, and so on. Companies would spend months or years developing common metrics, KPIs, and descriptions so that a warehouse would accurately represent this truth.

    In our professional lives, many items still require this certainty. There can only be one reported quarterly earnings figure. There can only be one count of beds in a hospital or of factories available for manufacturing. However, an increasing number of questions do not have this kind of tidy right or wrong answer. Consider the following:

    • Who are our best customers?
    • Is that loan risky?
    • Who are our most effective employees?
    • Should I be concerned about the latest interest rate hike?

    Words like best, risky, and effective are subjective by their very natures. Jordan Morrow (Qlik) writes and speaks extensively about the importance of data literacy and uses a phrase that has always felt intriguing: data literacy requires the ability to argue with data. This is key when the very nature of what we are evaluating does not have neat, tidy truths.

    Let’s give an example (sketched in code after the list below). A retail company is trying to liquidate its winter inventory and has asked three people to propose the best target list for an e-mail campaign.

    • John downloads last year’s campaign results and collects the names and e-mail addresses of the 2% that responded to the campaign last year with an order.
    • Jennifer thinks about the problem differently. She looks through sales records of anyone who has bought winter merchandise in the past 5 years during the month of March who had more than a 25% discount on the merchandise. She notices that these people often come to the web site to learn about sales before purchasing. Her reasoning is that a certain type of person who likes discounts and winter clothes is the target.
    • Juan takes yet another approach. He looks at social media feeds of brand influencers. He notices that there are 100 people with 1 million or more followers and that social media posts by these people about product sales traditionally cause a 1% spike in sales for the day as their followers flock to the stores. This is his target list.
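
    As a rough sketch (not from the original article), the three selections above could look like this in Python with pandas, assuming hypothetical campaign_2018, orders, and influencers tables with the column names used below.

    # All table and column names are assumptions made for illustration.
    import pandas as pd

    def john_list(campaign_2018: pd.DataFrame) -> pd.DataFrame:
        """Last year's responders: the 2% who placed an order after the campaign."""
        return campaign_2018.loc[campaign_2018["responded"], ["name", "email"]]

    def jennifer_list(orders: pd.DataFrame, today: pd.Timestamp) -> pd.DataFrame:
        """March buyers of winter merchandise at more than a 25% discount in the past 5 years."""
        recent = orders["order_date"] >= today - pd.DateOffset(years=5)
        march = orders["order_date"].dt.month == 3
        winter = orders["category"] == "winter"
        discounted = orders["discount"] > 0.25
        return orders.loc[recent & march & winter & discounted, ["name", "email"]].drop_duplicates()

    def juan_list(influencers: pd.DataFrame) -> pd.DataFrame:
        """Brand influencers with one million or more followers."""
        return influencers.loc[influencers["followers"] >= 1_000_000, ["name", "email"]]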

    So who has the right approach? This is where the ability to argue with data becomes critical. In theory, each of these people should feel confident developing a sales forecast based on his or her model. They should understand the metric that they are trying to drive, and they should be able to experiment with different ideas to drive a better outcome and confidently state their case.

    While this feels intuitive, enterprise processes and technologies are rarely set up to support this kind of vibrant analytics effort. This kind of analytics often starts with the phrase “I wonder if…”, while conventional IT and data governance frameworks are generally not able to deal with questions that a person did not know they had six months before. And yet, “I wonder if” relies upon data that may have been unforeseen. In fact, it usually requires connecting data sets that have never been connected before to drive break-out thinking. Data science is about identifying those variables and metrics that might be better predictors of performance. This relies on the analysis of new, potentially unexpected data sets such as social media followers, campaign results, web clicks, sales behavior, etc. Each of these items might be important for an analysis, but in a world in which it is unclear what is and is not important, how can a governance organization anticipate and apply the same dimensions of quality to all of the hundreds of data sets that people might use? And how can it apply the same rigor to data quality standards for the hundreds of thousands of data elements available, as opposed to the 100-300 critical data elements?

    They can’t. And that’s why we need to re-evaluate the nature of data governance for different kinds of analytics.

    Author: Joe Dos Santos

    Source: Qlik

  • Data management: building the bridge between IT and business

    Data management: building the bridge between IT and business

    We all know businesses are trying to do more with their data, but inaccuracy and general data management issues are getting in the way. For most businesses, the status quo for managing data is not always working. However, new research shows that data is moving from a knee-jerk, “must be IT’s issue” conversation to a “how can the business better leverage this rich (and expensive) data resource we have at our fingertips” conversation.

    The emphasis is on “conversation”: business and IT need to communicate in the new age of artificial intelligence, machine learning, and interactive analytics. Roles and responsibilities are blurring, and it is expected that a company’s data will quickly turn from a cost center of IT infrastructure into a revenue generator for the business. In order to address the issues of control and poor data quality, there needs to be an ever-stronger bridge between IT and the business. This bridge has two components. The first is technology that is both sophisticated enough to handle complex data issues and easy enough to provide a quick time-to-value. The second is people who are able to bridge the gap between IT concerns (systems, storage, access) and the business users’ need for value and results (enter data analysts and data engineers).

    This bridge needs to be built with three key components in mind:

    • Customer experience:

      For any B2C company, customer experience is the number one hot topic of the day and a primary way they are leveraging data. A new 2019 data management benchmark report found that 98% of companies use data to improve customer experience. And for good reason: between social media, digital streaming services, online retailers, and others, companies are looking to show the consumer that they aren’t just a corporation, but that they are the corporation most worthy of building a relationship with. This invariably involves creating a single view of the customer (SVC), and that view needs to be built around context and based on the needs of the specific department within the business (accounts payable, marketing, customer service, etc.).
    • Trust in data:

      Possessing data and trusting data are two completely different things. Lots of companies have lots of data, but that doesn’t mean they automatically trust it enough to make business-critical decisions with it. Research finds that, on average, organizations suspect 29% of current customer/prospect data is inaccurate in some way. In addition, 95% of organizations see impacts in their organization from poor quality data. A lack of trust in the data available to business users paralyzes decisions and, even worse, leads to the wrong decisions being made on faulty assumptions. How often have you received a report and questioned the results? More than you’d like to admit, probably. To get around this hurdle, organizations need to drive culture change around data quality strategies and methodologies. Only by completing a full assessment of data (see the minimal profiling sketch after this list), developing a strategy to address the existing and ongoing issues, and implementing a methodology to execute on that strategy will companies be able to turn the corner from data suspicion to data trust.
    • Changing data ownership:

      The responsibilities between IT and the business are blurring. 70% of businesses say that not having direct control over data impacts their ability to meet strategic objectives. The reality is that differing definitions of control are throwing people off. IT thinks of control in terms of storage, systems, and security. The business thinks of control in terms of access, actionability, and accuracy. The role of the CDO is helping to bridge this gap, bringing the nuts and bolts of IT in line with the visions and aspirations of the business.
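
    As a minimal illustration of the data assessment mentioned under “Trust in data” above, the following hypothetical Python sketch profiles a customer table for missing values and duplicate rows; the function and its output are assumptions, not a prescribed methodology.

    # A tiny, assumed data-quality profile: completeness and duplicate rate.
    import pandas as pd

    def profile_customer_data(customers: pd.DataFrame) -> dict:
        """Return simple trust indicators for a customer table."""
        return {
            "missing_pct_per_column": (customers.isna().mean() * 100).round(1).to_dict(),
            "duplicate_row_pct": round(customers.duplicated().mean() * 100, 1),
            "row_count": len(customers),
        }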

    The bottom line is that for most companies data is still a shifting sea of storage, software stacks, and stakeholders. The stakeholders are key, both from IT and the business, and in how the two can combine to provide the oxygen the business needs to survive: better customer experience, more personalization, and an ongoing trust in the data they administrate to make the best decisions to grow their companies and delight their customers.

    Author: Kevin McCarthy

    Source: Dataversity

  • Implementing data science is not ‘tinkering and fiddling without onlookers’


    We can learn a lot from the mistakes made at the Belastingdienst

    The Belastingdienst (the Dutch tax administration) is once again in rough weather. Following the negative coverage in 2016, the TV programme Zembla showed how the Belastingdienst gave shape to data analytics. The incubator in which this happened was known internally as a domain for ‘tinkering and fiddling without onlookers’ (‘prutsen en pielen zonder pottenkijkers’).

    Legislation trampled underfoot

    A government agency that tramples privacy and procurement legislation is of course guaranteed to generate tumult and viewing figures. And rightly so. Thinking in terms of cause and effect, however, the question is whether those legal violations are really the most interesting part. How could it happen that a bunch of data technology whizz-kids, guided by an external consultancy (Accenture), were placed in a ‘nursery’ and, separated from the rest of the organization, given carte blanche for… well, for what exactly?

    Under the leadership of Belastingdienst director Hans Blokpoel, a large data and analytics team was launched. Its mission: combine all the data known to the Belastingdienst in order to work more efficiently, detect fraud, and generate more tax revenue, and thus create value for the Belastingdienst. This looks like a data science strategy. But did the Belastingdienst really know what it was doing? Vacancy texts used to recruit data scientists speak of ‘tinkering and fiddling without onlookers’.

    Zembla’s complaint is that the team never actually rose above the level of ‘tinkering and fiddling’. Physical security, authentication, and authorization were inadequate. It was impossible to see who had accessed the financial data of 11 million citizens and 2 million companies, or whether these data had been downloaded or hacked. The law was, quite literally, not complied with.

    Problems with data science

    What is going wrong at the Belastingdienst happens at a great many companies and organizations. A director, manager, or executive deploys data and analytics to be (literally?) smarter than the rest. Isolated from the rest of the organization, smart young people are put to work on data without restrictions. Over time, all the experiments and trials produce some decent results: results that are supposed to deliver on the promise of the ‘data-driven organization’.

    Unfortunately, the case of the Belastingdienst makes clear once again that a ‘data-driven organization’ requires much more than the ability to collect and analyze data. Turning data into value requires vision (a data science strategy), a way of organizing that matches it (one data scientist is not the same as another), but also knowledge of the restrictions. It therefore demands a culture in which privacy and security are safeguarded. To give these elements adequate shape, you need a large part of the ‘old’ organization as well as an adequate embedding of the new unit or function.

    Strategy and expectations

    Data science creates expectations: more tax revenue at lower cost, higher turnover, or less fraud; efficiency in operations but also effectiveness in customer satisfaction; insight into (future) market developments. These are high expectations. Implementing data science, however, also requires investment: substantial investment in technology and in highly educated people, scarce people moreover, with knowledge of IT, statistics, research methodology, and so on. High expectations combined with substantial investments quickly lead to disappointment. Disappointment leads to pressure. Pressure often leads to pushing the boundaries. And pushing the boundaries leads to problems. The function of a strategy is to prevent this dynamic.

    Managing the relationship between expectations and investments starts with a data science strategy: an answer to the question of what we want to achieve with the implementation of data science, in what order, and within what time frame. Are we going to optimize the current processes (a business execution strategy) or transform them (a business transformation strategy)? Or should the data science team facilitate new ways of working (an enabling strategy)? These are questions an organization should ask itself before starting with data science. A clear answer to the strategy question guides the governance (what do we need to watch out for? what can go wrong?) as well as the expectations. Moreover, we then know who needs to be involved in the new function and who certainly does not.

    Governance and excesses

    Because in addition to a data science strategy, adequate governance requires an organization that is able to combine domain knowledge and expertise from the field with data. That requires being able to judge what is possible and what is not. And for that you need a large part of the ‘old’ organization. If that succeeds, the ‘data-driven organization’ becomes a reality. If it does not, you can wait for things to break, in this case the possible exposure of the financial data of all 11 million taxpaying citizens and 2 million companies. A data scientist who is a stranger to the domain is like a nuclear physicist who dreams up exotic (and therefore potentially dangerous) applications in experiments. When an organization does not steer on its objectives, and thus on its data science strategy, the chance of excesses increases.

    Data science is much more than technology

    Those with hands-on experience have long known that data science is much more than applying modern technology to large amounts of data. There are a number of important conditions for success. First of all, it requires a vision of how data and data technology can be turned into value. Next comes the question of how you want to realize this vision organizationally. Only then does a framework emerge within which data and technology can be deployed in a targeted way. That is how excesses are prevented and value is created for the organization. It is precisely these steps that appear to have been skipped at the Belastingdienst.

    Zembla

    The violation of legislation highlighted by Zembla is of course a lot more exciting. From the credo ‘prevention is better than cure’, it remains a pity that the proper application of data science in organizations was underexposed in the broadcast.

    Source: Business Data Science programme, Radboud Management Academy, http://www.ru.nl/rma/leergangen/bds/

    Authors: Alex Aalberts / Egbert Philips

  • Digital transformation strategies and tech investments often at odds


    While decision makers are well aware that digital transformation is essential to their organizations’ future, many are jumping into new technologies that don’t align with their current digital transformation pain points, according to a new report from PointSource, a division of Globant that provides IT solutions.

    All too often decision makers invest in technologies without taking a step back to assess how those technologies fit into their larger digital strategy and business goals, the study said. While the majority of such companies perceive these investments as a fast track to the next level of digital maturity, they are actually taking an avoidable detour. 

    PointSource surveyed more than 600 senior-level decision makers and found that a majority are investing in technology that they don’t feel confident using. In fact, at least a quarter plan to invest more than 25 percent of their 2018 budgets in artificial intelligence (AI), blockchain, voice-activated technologies or facial-recognition technologies.

    However, more than half (53 percent) of companies do not feel prepared to effectively use AI, blockchain or facial-recognition technologies.


    Companies are actively focusing on digital transformation, the survey showed. Ninety-four percent have increased focus on digital growth within the last year, and 90 percent said digital plays a central role in their overarching business goals.
    Fifty-seven percent of senior managers are unsatisfied with one or more of the technologies their organizations’ employees rely on. 

    Many companies feel digitally outdated, with 45 percent of decision makers considering their company’s digital infrastructure to be outdated compared with that of their competitors.

    Author: Bob Violino

    Source: Information Management

  • Do data scientists have the right stuff for the C-suite?

    What distinguishes strong from weak leaders? This raises the question of whether leaders are born or can be grown. It is the classic “nature versus nurture” debate. What matters more: genes or your environment?

    This question got me thinking about whether data scientists and business analysts within an organization can be more than just a support to others. Can they become leaders similar to C-level executives?

    Three primary success factors for effective leaders

    Having knowledge means nothing without having the right types of people. One person can make a big difference. They can be someone who somehow gets it all together and changes the fabric of an organization’s culture, not through mandating change but by engaging and motivating others.

    For weak and ineffective leaders, irritating people is not only a sport but their personal entertainment. Such leaders are rarely successful.

    One way to view successful leadership is to consider that there are three primary success factors for effective leaders. They are (1) technical competence, (2) critical thinking skills, and (3) communication skills. 

    You know there is a problem when a leader says, “I don’t do that; I have people who do that.” Good leaders do not necessarily have high intelligence, good memories, deep experience, or innate abilities that they are born with. They have problem solving skills. 

    As an example, the Ford Motor Company’s CEO Alan Mulally came to the automotive business from Boeing in the aerospace industry. He arrived without deep automotive industry experience, yet he has been successful at Ford. Why? Because he is an analytical type of leader.

    Effective managers are analytical leaders who are adaptable and possess systematic, methodical ways to achieve results. It may sound corny, but they apply the “scientific method”: formulating hypotheses and testing to prove or disprove them. We are back to basics.

    A major contributor to the “scientific method” was the German mathematician and astronomer Johannes Kepler. In the early 1600s Kepler’s three laws of planetary motion led to the Scientific Revolution. His three laws made the complex simple and understandable, suggesting that the seemingly inexplicable universe is ultimately lawful and within the grasp of the human mind. 

    Kepler did what analytical leaders do. They rely on searching for root causes and understanding cause-and-effect logic chains. Ultimately a well-formulated strategy, talented people, and the ability to execute the executive team’s strategy through robust communications are the key to performance improvement. 

    Key characteristics of the data scientist or analyst as leader

    The popular Moneyball book and subsequent movie about baseball in the US demonstrated how traditional baseball scouting methods (e.g., “He’s got a good swing.”) gave way to fact-based evidence and statistical analysis. Commonly accepted traits of a leader, such as being charismatic or strong, may also be misleading.

    My belief is that the scarcest resource in an organization is human ability and competence. That is why organizations should want every employee to be developed and to grow in their skills. But having sound competencies is not enough. Key personal qualities complete the package of an effective leader. 

    For a data scientist or analyst to evolve into an effective leader, three personal quality characteristics are needed: curiosity, imagination, and creativity. The three are sequentially linked. Curious people constantly ask “Why are things the way they are?” and “Is there a better way of doing things?” Without these personal qualities, innovation will be stifled. The emergence of analytics is creating opportunities for analysts as leaders. 

    Weak leaders are prone to a diagnostic bias. They can be blind to evidence and somehow believe their intuition, instincts, and gut-feel are acceptable masquerades for having fact-based information. In contrast, a curious person always asks questions. They typically love what they do. If they are also a good leader they infect others with enthusiasm. Their curiosity leads to imagination. Imagination considers alternative possibilities and solutions. Imagination in turn sparks creativity.

    Creativity is the implementation of imagination

    Good data scientists and analysts have a primary mission: to gain insights relying on quantitative techniques to result in better decisions and actions. Their imagination that leads to creativity can also result in vision. Vision is a mark of a good leader. In my mind, an executive leader has one job (aside from hiring good employees and growing them). That job is to answer the question, “Where do we want to go?” 

    After that question is answered, managers and analysts, ideally supported by the CFO’s accounting and finance team, can answer the follow-up question, “How are we going to get there?” That is where analytics are applied with the various enterprise and corporate performance management (EPM/CPM) methods that I regularly write about. EPM/CPM methods include a strategy map and its associated balanced scorecard with KPIs; customer profitability analysis; enterprise risk management (ERM); and capacity-sensitive, driver-based rolling financial forecasts and plans. Collectively they assure that the executive team’s strategy can be fully executed.
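
    As a toy illustration of one of the methods named above, the sketch below shows a heavily simplified driver-based rolling forecast in Python: revenue is projected from volume and price drivers over a horizon that rolls forward each month. The drivers and growth rate are invented, and the example is far simpler than a real capacity-sensitive forecast.

    # Illustrative only: drivers, growth rate, and price are assumptions.
    def rolling_forecast(last_units, unit_growth, price, months=12):
        """Project revenue for the next `months` periods from volume and price drivers."""
        forecast = []
        units = last_units
        for m in range(1, months + 1):
            units *= (1 + unit_growth)  # volume driver grows each month
            forecast.append({"month_ahead": m, "units": round(units), "revenue": round(units * price)})
        return forecast

    # Each month the forecast is re-run with the latest actuals, so the
    # 12-month horizon "rolls" forward instead of stopping at year-end.
    print(rolling_forecast(last_units=10_000, unit_growth=0.02, price=49.0)[:3])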

    My belief is that other perceived characteristics of a good leader are over-rated. These include ambition, team spirit, collegiality, integrity, courage, tenacity, discipline, and confidence. They are nice-to-have characteristics, but they pale compared to the technical competency, critical thinking, and communication skills that I described earlier. 

    Be analytical and you can be a leader. You can eventually serve in a C-suite role.

    Author: Gary Cokins 

    Source: Information Management

  • E-commerce and the growing importance of data

    E-commerce and the growing importance of data

    E-commerce is claiming a bigger role in global retail. In the US for example, e-commerce currently accounts for approximately 10% of all retail sales, a number that is projected to increase to nearly 18% by 2021. To a large extent, the e-commerce of the present exists in the shadow of the industry’s early entrant and top player, Amazon. Financial analysts predict that the retail giant will control 50% of the US’ online retail sales by 2021, leaving other e-commerce stores frantically trying to take a page out of the company’s incredibly successful online retail playbook.

    While it seems unlikely that another mega-retailer will rise to challenge Amazon’s e-commerce business in the near future, at least 50% of the online retail market is wide open. Smaller and niche e-commerce stores have a large opportunity to reach specialized audiences, create return customers, and cultivate persistent brand loyalty. Amazon may have had a first-mover advantage, but the rise of big data and the ease of access to analytics mean that smaller companies can find areas in which to compete and improve margins. As e-retailers look for ways to expand revenues while remaining lean, data offers a way forward for smart scalability.

    Upend your back-end

    While data can improve e-commerce’s customer-facing interactions, it can have just as major an impact on the customer experience factors that take place off camera. Designing products that customers want, having products in stock, making sure that products ship on schedule: all these kinds of back-end operations play a part in shaping customer experience and satisfaction. In order to shift e-commerce from a product-centric to a consumer-centric model, e-commerce companies need to invest in unifying customer data to inform internal processes and provide faster, smarter services.

    The field of drop shipping, for instance, is coming into its own thanks to smart data applications. Platforms like Oberlo are leveraging prescriptive analytics to enable intelligent product selection for Shopify stores, helping them curate trending inventory that sells, allowing almost anyone to create their own e-store. Just as every customer touchpoint can be enhanced with big data, e-commerce companies that apply unified big data solutions to their behind-the-scenes benefit from streamlined processes and workflow.

    Moreover, e-commerce companies that harmonize data across departments can identify purchasing trends and act on real-time data to optimize inventory processes. Using centralized data warehouse software like Snowflake empowers companies to create a single version of customer truth to automate reordering points and determine what items they should be stocking in the future. Other factors, such as pricing decisions, can also be finessed using big data to generate specific prices per product that match customer expectations and subsequently sell better.
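
    The automated reordering point mentioned above can be illustrated with a small, hypothetical Python sketch using a classic reorder-point rule (expected lead-time demand plus safety stock). The figures and parameters are invented for illustration and are not Snowflake functionality.

    # Illustrative reorder-point calculation; demand data and safety factor are assumptions.
    import statistics

    def reorder_point(daily_demand, lead_time_days, safety_factor=1.65):
        """Reorder when on-hand stock falls to expected lead-time demand plus safety stock."""
        avg_daily = statistics.mean(daily_demand)
        std_daily = statistics.stdev(daily_demand)
        lead_time_demand = avg_daily * lead_time_days
        safety_stock = safety_factor * std_daily * (lead_time_days ** 0.5)
        return lead_time_demand + safety_stock

    print(reorder_point(daily_demand=[40, 55, 38, 60, 47, 52, 45], lead_time_days=7))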

    Data transforms the customer experience

    When it comes to how data can impact the overall customer experience, e-commerce companies don’t have to reinvent the wheel. There’s a plethora of research that novice and veteran data explorers can draw on when it comes to optimizing customer experiences on their websites. General findings on the time it takes for customers to form an opinion of a website, customers’ mobile experience expectations, the best times to send promotional emails, and many more metrics can guide designers and developers tasked with improving e-commerce site traffic.

    However, e-commerce sites that are interested in more benefits will need to invest in more specific data tools that provide a 360-degree view of their customers. Prescriptive analytic tools like Tableau empower teams to connect the customer dots by synthesizing data across devices and platforms. Data becomes valuable as it provides insights that allow companies to make smarter decisions based on each consumer, identify inbound marketing opportunities, and automate recommendations and discounts based on the customer’s previous behavior.

    Data can also inspire changes in a field that has always dominated the customer experience: customer support. The digital revolution has brought substantial changes in the once sleepy field of customer service, pioneering new ways of direct communication with agents via social media and introducing the now ubiquitous AI chatbots. In order to provide the highest levels of customer satisfaction throughout these new initiatives, customer support can utilize data to anticipate when they might need more human agents staffing social media channels or the type of AI persona that their customers want to deal with. By improving customer service with data, e-commerce companies can improve the entire customer experience.

    Grow with your data

    As more and more data services migrate to the cloud, e-commerce companies have ever-expanding access to flexible data solutions that both fuel growth and scale alongside the businesses they’re helping. Without physical stores to facilitate face-to-face relationships, e-commerce companies are tasked with transforming their digital stores into online spaces that customers connect with and ultimately want to purchase from again and again.

    Data holds the key to this revolution. Instead of trying to force their agenda upon customers or engage in wild speculations about customer desires, e-commerce stores can use data to craft narratives that engage customers, create a loyal brand following, and drive increasing profits. With only about 2.5% of e-commerce and web visits converting to sales on average, e-commerce companies that want to stay competitive must open themselves up to big data and the growth opportunities it offers.

    Author: Ralph Tkatchuck

    Source: Dataconomy

  • A first impression of the merger between Cloudera and Hortonworks

    A first impression of the merger between Cloudera and Hortonworks

    A few months ago it was announced that big data companies Cloudera and Hortonworks would merge. The deal has since been approved, and Cloudera and Hortonworks are continuing as one company. Techzine spoke with Wim Stoop, senior product marketing manager at Cloudera. Stoop knows all the ins and outs of the vision behind this merger and what it means for companies and data analysts who work with the products of the two companies.

    Stoop says this merger is more or less the perfect marriage. Both companies focus on big data based on Hadoop and have specialized in it over the past years. Hortonworks, for example, is very good at Hortonworks DataFlow (HDF): working with streaming data that needs to be added to the Hadoop platform quickly.

    Cloudera Data Science Workbench

    With its Data Science Workbench, Cloudera has a good solution in hand for data analysts. With this workbench they can quickly and easily combine and analyze data, without immediately needing an extreme amount of computing power. The Cloudera workbench lets you experiment and test to see what outcomes it produces before applying it at large scale. The most important advantage is that the workbench can handle a great many programming languages, so every data analyst can work in his or her favorite language. The workbench also keeps exact track of which steps were taken to arrive at a result. The outcome is important, but the algorithm and methods that lead to the final result are at least as important.

    The route to a single solution

    If you dig deeper, there are of course many more areas where either Hortonworks or Cloudera is particularly strong, or where one technology is just a bit better or more efficient than the other. That will force the new company to make hard choices, but according to Stoop it will all work out. The need for a good data platform is enormous, so choices inevitably have to be made. Ultimately, the company is also responding to the criticism levelled at Hadoop. Hadoop itself forms the basis of the database, but on top of it there are so many different modules that can ingest, read, or process data that the overview is sometimes hard to find. The fact that there are so many solutions has to do with the open source character and the support of companies like Cloudera and Hortonworks, which are the largest contributors to many projects. That will also change with this merger. Later this year a new platform will be released under the name Cloudera Data Platform, in which the best components of Hortonworks and Cloudera will be combined. It also means that where projects or modules conflict, there will be good news for one and bad news for the other. For processing metadata, for example, the two companies currently use different solutions; in the Cloudera Data Platform we will see only one. This means the number of modules shrinks somewhat and everything becomes more manageable, which is positive for everyone involved.

    Cloudera Data Platform

    The name of the new company had not yet come up. The companies opted for a merger, but ultimately the Hortonworks name will simply disappear. The company will continue as Cloudera, hence the name Cloudera Data Platform. The intention is for the Cloudera Data Platform to become available later this year so that customers can start testing it. As soon as the platform is stable and mature enough, customers will be advised to migrate to it. All existing Cloudera and Hortonworks products will eventually disappear, but they will remain fully supported until the end of 2022. After that, everyone will have to move to the Cloudera Data Platform. Cloudera has already taken a migration path into account in the most recent versions of its current products. At Hortonworks this will now start happening as well. The company will take steps so that existing products and the new Data Platform are able to work together during the migration to the new platform.

    Shared data experience

    Another innovation that, according to Stoop, will become increasingly important in the future is the shared data experience. When customers use Cloudera products, these Hadoop environments can easily be linked together, so that resources (CPU, GPU, memory) can also be combined when analyzing data. Suppose a company has Cloudera environments for data analysis in its own data centers as well as on cloud platforms, and suddenly has to analyze a very large project. In that case it could combine all those environments and deploy them jointly. It is also possible, for example, to combine data from local offices and branches.

    Merger enables more innovation

    According to Stoop, a huge advantage of this merger is the development capacity that becomes available to build new, innovative solutions. The companies were often working separately on comparable projects; both contributed, for example, to different projects for handling metadata in Hadoop. In the end, one of the two was reinventing the wheel, and that is no longer necessary. Given the current labor market, finding developers with the right passion for and knowledge of data analysis is extremely difficult. With this merger, work can be done much more efficiently and quite a few teams can be devoted to developing new, innovative solutions. This week the Hortonworks Datasummit takes place in Barcelona, where more will undoubtedly be announced about the merger, the products, and the status of the new Cloudera Data Platform.

    Author: Coen van Eenbergen

    Source: Techzine

  • Effective data analysis methods in 10 steps

    Effective data analysis methods in 10 steps

    In this data-rich age, understanding how to analyze and extract true meaning from the digital insights available to our business is one of the primary drivers of success.

    Despite the colossal volume of data we create every day, a mere 0.5% is actually analyzed and used for data discovery, improvement, and intelligence. While that may not seem like much, considering the amount of digital information we have at our fingertips, half a percent still accounts for a huge amount of data.

    With so much data and so little time, knowing how to collect, curate, organize, and make sense of all of this potentially business-boosting information can be a minefield, but online data analysis is the solution.

    To help you understand the potential of analysis and how you can use it to enhance your business practices, we will answer a host of important analytical questions. Not only will we explore data analysis methods and techniques, but we’ll also look at different types of data analysis while demonstrating how to do data analysis in the real world with a 10-step blueprint for success.

    What is a data analysis method?

    Data analysis methods focus on strategic approaches to taking raw data, mining for insights that are relevant to a business’s primary goals, and drilling down into this information to transform metrics, facts, and figures into initiatives that drive improvement.

    There are various methods for data analysis, largely based on two core areas: quantitative data analysis methods and data analysis methods in qualitative research.

    Gaining a better understanding of different data analysis techniques and methods, in quantitative research as well as qualitative insights, will give your information analyzing efforts a more clearly defined direction, so it’s worth taking the time to allow this particular knowledge to sink in.

    Now that we’ve answered the question ‘what is data analysis?’ and considered the different types of data analysis methods, it’s time to dig deeper into how to do data analysis by working through these 10 essential elements.

    1. Collaborate your needs

    Before you begin to analyze your data or drill down into any analysis techniques, it’s crucial to sit down collaboratively with all key stakeholders within your organization, decide on your primary campaign or strategic goals, and gain a fundamental understanding of the types of insights that will best benefit your progress or provide you with the level of vision you need to evolve your organization.

    2. Establish your questions

    Once you’ve outlined your core objectives, you should consider which questions will need answering to help you achieve your mission. This is one of the most important steps in data analytics as it will shape the very foundations of your success.

To ensure your data works for you, you have to ask the right data analysis questions from the start.

    3. Harvest your data

After giving your data analytics methodology real direction and establishing which questions need answering to extract optimum value from the information available to your organization, you should decide on your most valuable data sources and start collecting your insights: the most fundamental of all data analysis techniques.

    4. Set your KPIs

Once you’ve set your data sources, started to gather the raw data you consider to potentially offer value, and established clear-cut questions you want your insights to answer, you need to set a host of key performance indicators (KPIs) that will help you track, measure, and shape your progress in a number of key areas.

    KPIs are critical to both data analysis methods in qualitative research and data analysis methods in quantitative research. This is one of the primary methods of analyzing data you certainly shouldn’t overlook.

To help you set the best possible KPIs for your initiatives and activities, explore our collection of key performance indicator examples.
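As an illustration only (the metric names, figures, and formulas below are a hypothetical e-commerce example, not part of the original article), a few common KPIs such as conversion rate, average order value, and cost per order can be computed directly from raw session and order counts:

```python
# Minimal sketch: computing a few illustrative e-commerce KPIs.
# All field names and figures are hypothetical.

def conversion_rate(orders: int, sessions: int) -> float:
    """Share of sessions that resulted in an order."""
    return orders / sessions if sessions else 0.0

def average_order_value(revenue: float, orders: int) -> float:
    """Revenue earned per order."""
    return revenue / orders if orders else 0.0

def cost_per_order(marketing_spend: float, orders: int) -> float:
    """Marketing spend needed to win one order."""
    return marketing_spend / orders if orders else 0.0

if __name__ == "__main__":
    sessions, orders, revenue, spend = 48_000, 1_200, 96_000.0, 18_000.0
    print(f"Conversion rate:     {conversion_rate(orders, sessions):.2%}")
    print(f"Average order value: {average_order_value(revenue, orders):.2f}")
    print(f"Cost per order:      {cost_per_order(spend, orders):.2f}")
```

Defining KPIs as explicit, testable functions like this makes it easier to track them consistently across reports.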

    5. Omit useless data

    Having defined your mission and bestowed your data analysis techniques and methods with true purpose, you should explore the raw data you’ve collected from all sources and use your KPIs as a reference for chopping out any information you deem to be useless.

    Trimming the informational fat is one of the most crucial steps of data analysis as it will allow you to focus your analytical efforts and squeeze every drop of value from the remaining ‘lean’ information.

    Any stats, facts, figures, or metrics that don’t align with your business goals or fit with your KPI management strategies should be eliminated from the equation.
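As a rough sketch of this trimming step (the column names below are invented, and pandas is used purely for illustration), an explicit whitelist of KPI-relevant fields is often the simplest way to drop everything else:

```python
# Minimal sketch: keep only the columns that feed agreed KPIs.
# Column names and values are hypothetical.
import pandas as pd

raw = pd.DataFrame({
    "order_id": [1, 2, 3],
    "revenue": [120.0, 80.0, 60.5],
    "marketing_channel": ["search", "email", "search"],
    "internal_debug_flag": [0, 1, 0],        # not tied to any KPI
    "legacy_export_code": ["A", "B", "A"],   # not tied to any KPI
})

KPI_COLUMNS = ["order_id", "revenue", "marketing_channel"]

lean = raw[KPI_COLUMNS]  # drop everything the KPIs don't need
print(lean.head())
```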

    6. Conduct statistical analysis

    One of the most pivotal steps of data analysis methods is statistical analysis.

This analysis method focuses on techniques including cluster, cohort, regression, and factor analysis, as well as neural networks, and will ultimately give your data analysis methodology a more logical direction.

    Here is a quick glossary of these vital statistical analysis terms for your reference:

    • Cluster: The action of grouping a set of elements in a way that said elements are more similar (in a particular sense) to each other than to those in other groups, hence the term ‘cluster’.
    • Cohort: A subset of behavioral analytics that takes insights from a given data set (e.g. a web application or CMS) and instead of looking at everything as one wider unit, each element is broken down into related groups.
    • Regression: A definitive set of statistical processes centered on estimating the relationships among particular variables to gain a deeper understanding of particular trends or patterns.
    • Factor: A statistical practice utilized to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called ‘factors’. The aim here is to uncover independent latent variables.
    • Neural networks: A form of machine learning that is far too broad to summarize in one line; in essence, layers of interconnected nodes learn patterns directly from data and can model highly complex, non-linear relationships.
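To make the glossary above a little more concrete, here is a minimal, self-contained sketch of two of these techniques, cluster and regression analysis, on synthetic data (the variable names, numbers, and scikit-learn usage are illustrative assumptions, not part of the original article):

```python
# Minimal sketch of cluster and regression analysis on synthetic data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# Cluster analysis: group customers by (visits per month, average basket value).
group_a = rng.normal(loc=[5.0, 20.0], scale=3.0, size=(100, 2))
group_b = rng.normal(loc=[25.0, 80.0], scale=3.0, size=(100, 2))
customers = np.vstack([group_a, group_b])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)
print("Cluster sizes:", np.bincount(labels))

# Regression analysis: estimate how ad spend relates to weekly revenue.
ad_spend = rng.uniform(1_000, 10_000, size=(200, 1))
revenue = 3.2 * ad_spend.ravel() + rng.normal(0.0, 2_000.0, size=200)
model = LinearRegression().fit(ad_spend, revenue)
print(f"Estimated revenue per extra unit of ad spend: {model.coef_[0]:.2f}")
```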

    7. Build a data management roadmap

    While (at this point) this particular step is optional (you will have already gained a wealth of insight and formed a fairly sound strategy by now), creating a data governance roadmap will help your data analysis methods and techniques become successful on a more sustainable basis. These roadmaps, if developed properly, are also built so they can be tweaked and scaled over time.

    Invest ample time in developing a roadmap that will help you store, manage, and handle your data internally, and you will make your analysis techniques all the more fluid and functional.

    8. Integrate technology

    There are many ways to analyze data, but one of the most vital aspects of analytical success in a business context is integrating the right  decision support software and technology.

Robust analysis platforms will not only allow you to pull critical data from your most valuable sources while working with dynamic KPIs that offer you actionable insights; they will also present the information in a digestible, visual, interactive format from one central, live dashboard: a data analytics methodology you can count on.

    By integrating the right technology for your statistical method data analysis and core data analytics methodology, you’ll avoid fragmenting your insights, saving you time and effort while allowing you to enjoy the maximum value from your business’s most valuable insights.

    9. Answer your questions

By considering each of the above efforts, working with the right technology, and fostering a cohesive internal culture where everyone buys into the different ways to analyze data as well as the power of digital intelligence, you will swiftly start to answer your most important, burning business questions.

    10. Visualize your data

Arguably, the best way to make your data analysis concepts accessible across the organization is through data visualization. Online data visualization is a powerful tool because it lets you tell a story with your metrics, allowing users across the business to extract meaningful insights that aid business evolution, whichever of the different ways to analyze data produced them.

    The purpose of data analysis is to make your entire organization more informed and intelligent, and with the right platform or dashboard, this can be simpler than you think.
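As a small, hedged example of that last step (the monthly figures below are invented), even a basic chart of one KPI over time is often enough to start the conversation:

```python
# Minimal sketch: turning a small KPI series into a shareable chart.
# The monthly figures are invented for illustration.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
conversion_rate = [2.1, 2.4, 2.2, 2.9, 3.1, 3.4]  # percent

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(months, conversion_rate, marker="o")
ax.set_title("Conversion rate by month")
ax.set_ylabel("Conversion rate (%)")
ax.grid(True, alpha=0.3)
fig.tight_layout()
fig.savefig("conversion_rate.png")  # or plt.show() in an interactive session
```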

    Data analysis in the big data environment

    Big data is invaluable to today’s businesses, and by using different methods for data analysis, it’s possible to view your data in a way that can help you turn insight into positive action.

To inspire your efforts and put the importance of big data into context, here are some insights that could prove helpful: facts that will help shape your big data analysis techniques.

    • By 2020, around 1.7 megabytes of new information will be generated every second for every single person on the planet.
    • A 10% boost in data accessibility will result in more than $65 million extra net income for your average Fortune 1000 company.
    • 90% of the world’s big data was created in the past three years.
    • According to Accenture, 79% of notable business executives agree that companies that fail to embrace big data will lose their competitive position and could face extinction. Moreover, 83% of business execs have implemented big data projects to gain a competitive edge.

    Data analysis concepts may come in many forms, but fundamentally, any solid data analysis methodology will help to make your business more streamlined, cohesive, insightful and successful than ever before.

    Author: Sandra Durcevic

    Source: Datapine

  • Exploring the risks of artificial intelligence

“Science has not yet mastered prophecy. We predict too much for the next year and yet far too little for the next ten.”

These words, articulated by Neil Armstrong at a speech to a joint session of Congress in 1969, fit squarely into most every decade since the turn of the century, and it seems safe to posit that the rate of change in technology has accelerated to an exponential degree in the last two decades, especially in the areas of artificial intelligence and machine learning.

    Artificial intelligence is making an extreme entrance into almost every facet of society in predicted and unforeseen ways, causing both excitement and trepidation. This reaction alone is predictable, but can we really predict the associated risks involved?

It seems we’re all trying to get a grip on potential reality, but information overload (yet another side effect that we’re struggling to deal with in our digital world) can ironically make constructing an informed opinion more challenging than ever. In the search for some semblance of truth, it can help to turn to those in the trenches.

In my continued interviews with over 30 artificial intelligence researchers, I asked what they considered to be the most likely risk of artificial intelligence in the next 20 years.

    Some results from the survey, shown in the graphic below, included 33 responses from different AI/cognitive science researchers. (For the complete collection of interviews, and more information on all of our 40+ respondents, visit the original interactive infographic here on TechEmergence).

Two “greatest” risks bubbled to the top of the response pool (and the majority are not in the autonomous robots’ camp, though a few do fall into that one). According to this particular set of minds, the most pressing short- and long-term risks are the financial and economic harm that may be wrought, as well as the mismanagement of AI by human beings.

    Dr. Joscha Bach of the MIT Media Lab and Harvard Program for Evolutionary Dynamics summed up the larger picture this way:

    “The risks brought about by near-term AI may turn out to be the same risks that are already inherent in our society. Automation through AI will increase productivity, but won’t improve our living conditions if we don’t move away from a labor/wage based economy. It may also speed up pollution and resource exhaustion, if we don’t manage to install meaningful regulations. Even in the long run, making AI safe for humanity may turn out to be the same as making our society safe for humanity.”

    Essentially, the introduction of AI may act as a catalyst that exposes and speeds up the imperfections already present in our society. Without a conscious and collaborative plan to move forward, we expose society to a range of risks, from bigger gaps in wealth distribution to negative environmental effects.

    Leaps in AI are already being made in the area of workplace automation and machine learning capabilities are quickly extending to our energy and other enterprise applications, including mobile and automotive. The next industrial revolution may be the last one that humans usher in by their own direct doing, with AI as a future collaborator and – dare we say – a potential leader.

Some researchers believe it’s a matter of when and not if. In the words of Dr. Nils Nilsson, a professor emeritus at Stanford University, “Machines will be singing the song, ‘Anything you can do, I can do better; I can do anything better than you’.”

With respect to the drastic changes that lie ahead for the employment market due to increasingly autonomous systems, Dr. Helgi Helgason says, “it’s more of a certainty than a risk and we should already be factoring this into education policies.”

    Talks at the World Economic Forum Annual Meeting in Switzerland this past January, where the topic of the economic disruption brought about by AI was clearly a main course, indicate that global leaders are starting to plan how to integrate these technologies and adapt our world economies accordingly – but this is a tall order with many cooks in the kitchen.

    Another commonly expressed risk over the next two decades is the general mismanagement of AI. It’s no secret that those in the business of AI have concerns, as evidenced by the $1 billion investment made by some of Silicon Valley’s top tech gurus to support OpenAI, a non-profit research group with a focus on exploring the positive human impact of AI technologies.

    “It’s hard to fathom how much human-level AI could benefit society, and it’s equally hard to imagine how much it could damage society if built or used incorrectly,” is the parallel message posted on OpenAI’s launch page from December 2015. How we approach the development and management of AI has far-reaching consequences, and shapes future society’s moral and ethical paradigm.

    Philippe Pasquier, an associate professor at Simon Fraser University, said “As we deploy more and give more responsibilities to artificial agents, risks of malfunction that have negative consequences are increasing,” though he likewise states that he does not believe AI poses a high risk to society on its own.

    With great responsibility comes great power, and how we monitor this power is of major concern.

    Dr. Pei Wang of Temple University sees major risk in “neglecting the limitations and restrictions of hot techniques like deep learning and reinforcement learning. It can happen in many domains.” Dr. Peter Voss, founder of SmartAction, expressed similar sentiments, stating that he most fears “ignorant humans subverting the power and intelligence of AI.”

    Thinking about the risks associated with emerging AI technology is hard work, engineering potential solutions and safeguards is harder work, and collaborating globally on implementation and monitoring of initiatives is the hardest work of all. But considering all that’s at stake, I would place all my bets on the table and argue that the effort is worth the risk many times over.

    Source: Tech Crunch

  • Facing the major challenges that come with big data

    Worldwide, over 2.5 quintillion bytes of data are created every day. And with the expansion of the Internet of Things (IoT), that pace is increasing. 90% of the current data in the world was generated in the last two years alone. When it comes to businesses, for a forward thinking, digitally transforming organization, you’re going to be dealing with data. A lot of data. Big data.

    Challenges faced by businesses

    While simply collecting lots of data presents comparatively few problems, most businesses run into two significant roadblocks in its use: extracting value and ensuring responsible handling of data to the standard required by data privacy legislation like GDPR. What most people don’t appreciate is the sheer size and complexity of the data sets that organizations have to store and the related IT effort, requiring teams of people working on processes to ensure that others can access the right data in the right way, when they need it, to drive essential business functions. All while ensuring personal information is treated appropriately.

    The problem comes when you’ve got multiple teams around the world, all running to different beats, without synchronizing. It’s a bit like different teams of home builders, starting work independently, from different corners of a new house. If they have all got their own methods and bricks, then by the time they meet in the middle, their efforts won’t match up. It’s the same in the world of IT. If one team is successful, then all teams should be able to learn those lessons of best practice. Meanwhile, siloed behavior can become “free form development” where developers write code to suit a specific problem that their department is facing, without reference to similar or diverse problems that other departments may be experiencing.

In addition, often there simply aren’t enough builders to go around to get these data projects turned around quickly, which can be a problem in the face of heightening business demand. In the scramble to get things done at the pace of modern business, at the very least there will be some duplication of effort, but there’s also a high chance of confusion, and the foundations for future data storage and analysis won’t be firm. Creating a unified, standard approach to data processing is critical – as is finding a way to implement it with the lowest possible level of resource, at the fastest possible speeds.

    Data Vault automation

    One of the ways businesses can organize data to meet both the needs for standardization and flexibility is in a Data Vault environment. This data warehousing methodology is designed to bring together information from multiple different teams and systems into a centralized repository, providing a bedrock of information that teams can use to make decisions – it includes all of the data, all of the time, ensuring that no information is missed out of the process.

    However, while a Data Vault design is a good architect’s drawing, it won’t get the whole house built on its own. Developers can still code and build it manually over time but given its complexity they certainly cannot do this quickly, and potentially may not be able to do it in a way that can stand up to the scrutiny of data protection regulations like GDPR. Building a Data Vault environment by hand, even using standard templates, can be incredibly laborious and potentially error prone.

    This is where Data Vault automation comes in, taking care of the 90% or so of an organization’s data infrastructure that fits standardized templates and the stringent requirements that the Data Vault 2.0 methodology demands. Data vault automation can lay out the core landscape of a Data Vault, as well as make use of reliable, consistent metadata to ensure information, including personal information, can be monitored both at its source and over time as records are changed.
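The article stays at the conceptual level, but as a rough, hypothetical sketch of the repetitive work such automation takes over: Data Vault 2.0 conventionally derives hub keys by hashing business keys and stamps every row with a load date and a record source. The table and field names below are invented for illustration:

```python
# Hypothetical sketch of standard Data Vault 2.0 metadata handling:
# a hashed hub key plus load date and record source columns.
import hashlib
from datetime import datetime, timezone

def hash_key(*business_keys: str) -> str:
    """Derive a deterministic hash key from one or more business keys."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def hub_customer_row(business_key: str, record_source: str) -> dict:
    """Build one hub record with the standard metadata columns."""
    return {
        "hub_customer_hk": hash_key(business_key),
        "customer_bk": business_key,
        "load_date": datetime.now(timezone.utc).isoformat(),
        "record_source": record_source,
    }

print(hub_customer_row("CUST-00042", "crm_system"))
```

Generating this kind of boilerplate by hand for hundreds of hubs, links, and satellites is exactly the laborious, error-prone work the author argues automation should absorb.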

    Author: Dan Linstedt

    Source: Insidebigdata

  • Five Mistakes That Can Kill Analytics Projects

Launching an effective digital analytics strategy is a must-do to understand your customers. But many organizations are still trying to figure out how to get business value from expensive analytics programs. Here are 5 common analytics mistakes that can kill any predictive analytics effort.

    Why predictive analytics projects fail


Predictive Analytics is becoming the next big buzzword in the industry. But according to Mike Le, co-founder and chief operating officer at CB/I Digital in New York, implementing an effective digital analytics strategy has proven to be very challenging for many organizations. “First, the knowledge and expertise required to set up and analyze digital analytics programs is complicated,” Le notes. “Second, the investment for the tools and such required expertise could be high. Third, many clients see unclear returns from such analytics programs. Learning to avoid common analytics mistakes will help you save a lot of resources to focus on core metrics and factors that can drive your business ahead.” Here are 5 common mistakes that Le says cause many predictive analytics projects to fail.

    Mistake 1: Starting digital analytics without a goal

“The first challenge of digital analytics is knowing what metrics to track, and what value to get out of them,” Le says. “As a result, we see too many web businesses that don’t have basic conversion tracking set up, or can’t link the business results with the factors that drive those results. This problem happens because these companies don’t set a specific goal for their analytics. When you do not know what to ask, you cannot know what you'll get. The purpose of analytics is to understand and to optimize. Every analytics program should answer specific business questions and concerns. If your goal is to maximize online sales, naturally you’ll want to track the order volume, cost-per-order, conversion rate and average order value. If you want to optimize your digital product, you’ll want to track how users interact with your product, the usage frequency and the churn rate of people leaving the site. When you know your goal, the path becomes clear.”

    Mistake 2: Ignoring core metrics to chase noise

“When you have advanced analytics tools and strong computational power, it’s tempting to capture every data point possible to ‘get a better understanding’ and ‘make the most of the tool,’” Le explains. “However, following too many metrics may dilute your focus on the core metrics that reveal the pressing needs of the business. I've seen digital campaigns that fail to convert new users, but the managers still set up advanced tracking programs to understand user behaviors in order to serve them better. When you cannot acquire new users, your targeting could be wrong, your messaging could be wrong or there is even no market for your product - those problems are much bigger to solve than trying to understand your user engagement. Therefore, it would be a waste of time and resources to chase fancy data and insights while the fundamental metrics are overlooked. Make sure you always stay focused on the most important business metrics before looking broader.”

    Mistake 3: Choosing overkill analytics tools

“When selecting analytics tools, many clients tend to believe that more advanced and expensive tools can give deeper insights and solve their problems better,” Le says. “Advanced analytics tools may offer more sophisticated analytic capabilities over some fundamental tracking tools. But whether your business needs all those capabilities is a different story. That's why the decision to select an analytics tool should be based on your analytics goals and business needs, not on how advanced the tools are. There’s no need to invest a lot of money in big analytics tools and a team of experts for an analytics program while some advanced features of free tools like Google Analytics can already give you the answers you need.”

    Mistake 4: Creating beautiful reports with little business value

“Many times you see reports that simply present a bunch of numbers exported from tools, or state some ‘insights’ that have little relevance to the business goal,” Le notes. “This problem is so common in the analytics world, because a lot of people create reports for the sake of reporting. They don’t think about why those reports should exist, what questions they answer and how those reports can add value to the business. Any report must be created to answer a business concern. Any metrics that do not help answer business questions should be left out. Making sense of data is hard. Asking the right questions early will help.”

    Mistake 5: Failing to detect tracking errors

“Tracking errors can be devastating to businesses, because they produce unreliable data and misleading analysis,” Le cautions. “But many companies do not have the skills to set up tracking properly, and worse, to detect tracking issues when they happen. There are many things that can go wrong, such as a developer mistakenly removing the tracking pixels, transferring incorrect values, the tracking code firing unstably or multiple times, wrong tracking rule logic, etc. The difference could be so subtle that the reports look normal, or are only wrong in certain scenarios. Tracking errors easily go undetected because catching them takes a mix of marketing and tech skills. Marketing teams usually don’t understand how tracking works, and development teams often don’t know what ‘correct’ means. To tackle this problem, you should frequently check your data accuracy and look for unusual signs in reports. Analysts should take an extra step to learn the technical aspect of tracking, so they can better sense the problems and raise smart questions for the technical team when the data looks suspicious.”
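As a hedged illustration of the "frequently check your data accuracy" advice (the event fields and thresholds below are hypothetical, not from the article), even a simple automated sanity check can flag double-firing tracking code and implausible values before they distort reports:

```python
# Minimal sketch: basic sanity checks on raw tracking events.
# Event structure and thresholds are hypothetical.
from collections import Counter

events = [
    {"event_id": "e1", "type": "purchase", "value": 49.99},
    {"event_id": "e1", "type": "purchase", "value": 49.99},  # duplicate fire
    {"event_id": "e2", "type": "purchase", "value": -10.0},  # implausible value
    {"event_id": "e3", "type": "purchase", "value": 120.0},
]

duplicates = [eid for eid, n in Counter(e["event_id"] for e in events).items() if n > 1]
bad_values = [e["event_id"] for e in events if e["value"] < 0]

if duplicates:
    print("Possible double-firing tracking code for events:", duplicates)
if bad_values:
    print("Events transferring implausible values:", bad_values)
```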

    Author: Mike Le

    Source: Information Management

  • Gaining advantages with the IoT through 'Thing Management'

    Some are calling the industrial Internet of Things the next industrial revolution, bringing dramatic changes and improvements to almost every sector. But to be sure it’s successful, there is one big question: how can organizations manage all the new things that are part of their organizations’ landscapes?

    Most organizations see asset management as the practice of tracking and managing IT devices such as routers, switches, laptops and smartphones. But that’s only part of the equation nowadays. With the advent of the IoT, enterprise things now include robotic bricklayers, agitators, compressors, drug infusion pumps, track loaders, scissor lifts and the list goes on and on, while all these things are becoming smarter and more connected.

    These are some examples for specific industries:

    ● Transportation is an asset-intensive industry that relies on efficient operations to achieve maximum profitability. To help customers manage these important assets, GE Transportation is equipping its locomotives with devices that manage hundreds of data elements per second. The devices decipher locomotive data and uncover use patterns that keep trains on track and running smoothly.

    ● The IoT’s promise for manufacturing is substantial. The IoT can build bridges that help solve the frustrating disconnects among suppliers, employees, customers, and others. In doing so, the IoT can create a cohesive environment where every participant is invested in and contributing to product quality and every customer’s feedback is learned from. Smart sensors, for instance, can ensure that every item, from articles of clothing to top-secret defense weapons, can have the same quality as the one before. The only problem with this is that the many pieces of the manufacturing puzzle and devices in the IoT are moving so quickly that spreadsheets and human analysis alone are not enough to manage the devices.

    ● IoT in healthcare will help connect a multitude of people, things with smart sensors (such as wearables and medical devices), and environments. Sensors in IoT devices and connected “smart” assets can capture patient vitals and other data in real time. Then data analytics technologies, including machine learning and artificial intelligence (AI), can be used to realize the promise of value-based care. There’s significant value to be gained, including operational efficiencies that boost the quality of care while reducing costs, clinical improvements that enable more accurate diagnoses, and more.

    ● In the oil and gas industry, IoT sensors have transformed efficiencies around the complex process of natural resource extraction by monitoring the health and efficiency of hard-to-access equipment installations in remote areas with limited connectivity.

    ● Fuelled by greater access to cheap hardware, the IoT is being used with notable success in logistics and fleet management by enabling cost-effective GPS tracking and automated loading/unloading.

All of these industries will benefit from the IoT. However, as the IoT world expands, these industries and others are looking for ways to track the barrage of new things that are now pivotal to their success. Thing Management pioneers such as Oomnitza help organizations manage devices as diverse as phones, fork lifts, drug infusion pumps, drones and VR headsets, providing an essential service as the industrial IoT flourishes.

    Think IoT, not IoP

To successfully manage these Things, enterprises are not only looking for Thing Management. They also are rethinking the Internet, not as the Internet of People (IoP), but as the Internet of Things (IoT). Things aren’t people, and there are three fundamental differences.

    Many more things are connected to the Internet than people

John Chambers, former CEO of Cisco, recently declared there will be 500 billion things connected by 2024. That’s roughly 65 times the number of people on the planet.

    Things have more to say than people

    A typical cell phone has nearly 14 sensors, including an accelerometer, GPS, and even a radiation detector. Industrial things such as wind turbines, gene sequencers, and high-speed inserters can easily have over 100 sensors.

    Things can speak much more frequently

    People enter data at a snail’s pace when compared to the barrage of data coming from the IoT. A utility grid power sensor, for instance, can send data 60 times per second, a construction forklift once per minute, and a high-speed inserter once every two seconds.
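A quick back-of-the-envelope calculation shows how fast those rates add up (assuming, for simplicity, that each device reports around the clock):

```python
# Readings per day at the reporting rates quoted above,
# assuming continuous operation (a simplifying assumption).
SECONDS_PER_DAY = 24 * 60 * 60

rates_per_second = {
    "grid power sensor (60 per second)": 60,
    "high-speed inserter (1 per 2 seconds)": 0.5,
    "construction forklift (1 per minute)": 1 / 60,
}

for device, rate in rates_per_second.items():
    print(f"{device}: {rate * SECONDS_PER_DAY:,.0f} readings per day")
```

That is roughly 5.2 million readings per day from a single grid sensor, which is why enterprise software built for things has to be engineered differently from software built for people.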

Technologists and business people alike need to learn how to collect and put to use all of the data coming from the industrial IoT, and how to manage every connected thing. They will have to learn how to build enterprise software for things versus people.

    How the industrial IoT will shape the future

    The industrial IoT is all about value creation: increased profitability, revenue, efficiency, and reliability. It starts with the target of safe, stable operations and meeting environmental regulations, translating to greater financial results and profitability.

    But there’s more to the big picture of the IoT than that. Building the next generation of software for things is a worthy goal, with potential results such as continually improving enterprise efficiency and public safety, driving down costs, decreasing environmental impacts, boosting educational outcomes and more. Companies like GE, Oomnitza and Bosch are investing significant amounts of money in the ability to connect, collect data, and learn from their machines.

The IoT and the next generation of enterprise software will have big economic impacts as well. The cost savings and productivity gains generated through “smart” thing monitoring and adaptation are projected to create $1.1 trillion to $2.5 trillion in value in the health care sector, $2.3 trillion to $11.6 trillion in global manufacturing, and $500 billion to $757 billion in municipal energy and service provision over the next decade. The total global impact of IoT technologies could generate anywhere from $2.7 trillion to $14.4 trillion in value by 2025.

    Author: Timothy Chou

    Source: Information-management

  • Gaining control of big data with the help of NVMe

    Every day there is an unfathomable amount of data, nearly 2.5 quintillion bytes, being generated all around us. Part of the data being created we see every day, such as pictures and videos on our phones, social media posts, banking and other apps.

    In addition to this, there is data being generated behind the scenes by ubiquitous sensors and algorithms, whether that’s to process quicker transactions, gain real-time insights, crunch big data sets or to simply meet customer expectations. Traditional storage architectures are struggling to keep up with all this data creation, leading IT teams to investigate new solutions to keep ahead and take advantage of the data boom.

    Some of the main challenges are understanding performance, removing data throughput bottlenecks and being able to plan for future capacity. Architecture can often lock businesses in to legacy solutions, and performance needs can vary and change as data sets grow.

Architectures designed and built around NVMe (non-volatile memory express) can provide the perfect balance, particularly for data-intensive applications that demand fast performance. This is extremely important for organizations that are dependent on speed, accuracy and real-time data insights.

Industries such as healthcare, autonomous vehicles, artificial intelligence (AI)/machine learning (ML) and genomics are at the forefront of the transition to high performance NVMe storage solutions that deliver fast data access for high performance computing systems that drive new research and innovations.

    Genomics

    With traditional storage architectures, detailed genome analysis can take upwards of five days to complete, which makes sense considering an initial analysis of one person’s genome produces approximately 300GB - 1TB of data, and a single round of secondary analysis on just one person’s genome can require upwards of 500TB storage capacity. However, with an NVMe solution implemented it’s possible to get results in just one day.

In a typical study, genome research and life sciences companies need to process, compare and analyze the genomes of between 1,000 and 5,000 people per study. This is a huge amount of data to store, but it’s imperative that it’s done. These studies are working toward revolutionary scientific and medical advances, looking to personalize medicine and provide advanced cancer treatments. This is only now becoming possible thanks to the speed with which NVMe enables researchers to explore and analyze the human genome.

    Autonomous vehicles

A growing trend in the tech industry is autonomous vehicles. Self-driving cars are the next big thing, and various companies are working tirelessly to perfect the idea. In order to function properly, these vehicles need very fast storage to accelerate the applications and data that ‘drive’ autonomous vehicle development. Core requirements for autonomous vehicle storage include:

    • Must have a high capacity in a small form factor
    • Must be able to accept input data from cameras and sensors at “line rate” – AKA have extremely high throughput and low latency
    • Must be robust and survive media or hardware failures
    • Must be “green” and have minimal power footprint
    • Must be easily removable and reusable
    • Must use simple but robust networking

    What kind of storage meets all these requirements? That’s right – NVMe.

    Artificial Intelligence

Artificial Intelligence (AI) is gaining a lot of traction in a variety of industries, ranging from finance to manufacturing and beyond. In finance, AI does things like predict investment trends. In manufacturing, AI-based image recognition software checks for defects during product assembly. Wherever it’s used, AI needs a high level of computing power, coupled with a high-performance and low-latency architecture, in order to enable parallel processing of data in real time.

Once again, NVMe steps up to the plate, providing the speed and processing power that is critical during training and inference. Without NVMe to prevent bottlenecks and latency issues, these stages can take much, much longer, which in turn can lead to the temptation to take shortcuts, causing software to malfunction or make incorrect decisions down the line.

The rapid increase in data creation has put traditional storage architectures under high pressure due to their lack of scalability and flexibility, both of which are required to fulfill future capacity and performance requirements. This is where NVMe comes in, breaking the barriers of existing designs by offering unanticipated density and performance. The breakthroughs that NVMe offers provide exactly what is needed to help manage and maintain the data boom.

    Author: Ron Herrmann

    Source: Dataversity

     

  • Gartner: 5 cool vendors in data science and machine learning

Research firm Gartner has identified five "cool vendors" in the data science and machine learning space, identifying the features that make their products especially unique or useful. The report, "5 Cool Vendors in Data Science and Machine Learning" was written by analysts Peter Krensky, Svetlana Sicular, Jim Hare, Erick Brethenoux and Austin Kronz. Here are the highlights of what they had to say about each vendor.

    DimensionalMechanics

    Bellevue, Washington
    www.dimensionalmechanics.com
    “DimensionalMechanics has built a data science platform that breaks from market traditions; where more conventional vendors have developed work flow-based or notebook-based data science environments, DimensionalMechanics has opted for a “data-science metalanguage,” Erick Brethenoux writes. “In effect, given the existing use cases the company has handled so far, its NeoPulse Framework 2.0 acts as an “AutoDL” (Auto-Deep Learning) platform. This makes new algorithms and approaches to unusual types of data (such as images, videos and sounds) more accessible and deployable.”

    Immuta

    College Park, Maryland
    www.immuta.com
    “Immuta offers a dedicated data access and management platform for the development of machine learning and other advanced analytics, and the automation of policy enforcement,” Peter Krensky and Jim Hare write. “The product serves as a control layer to rapidly connect and control access between myriad data sources and the heterogeneous array of data science tools without the need to move or copy data. This approach addresses the market expectation that platforms supporting data science will be highly flexible and extensible to the data portfolio and toolkit of a user’s choosing.”

    Indico

    Boston, Massachusetts
    www.indico.io
    “Indico offers a group of products with a highly accessible set of functionality for exploring and modeling unstructured data and automating processes,” according to Peter Krensky and Austin Kronz. “The offering can be described as a citizen data science toolkit for applying deep learning to text, images and document-based data. Indico’s approach makes deep learning a practical solution for subject matter experts (SMEs) facing unstructured content challenges. This is ambitious and exciting, as both deep learning and unstructured content analytics are areas where even expert data scientists are still climbing the learning curve.”

     

    Octopai

    Rosh HaAyin, Israel & New York, New York
    www.octopai.com
    “Octopai solves a foundational problem for data-driven organizations — enabling data science teams and citizen data scientists to quickly find the data, establish trust in data sources and achieve transparency of data lineage through automation,” explains Svetlana Sicular. “It connects the dots of complex data pipelines by using machine learning and pattern analysis to determine the relationships among different data elements, the context in which the data was created, and the data’s prior uses and transformations. Such access to more diverse, transparent and trustworthy data leads to better quality analytics and machine learning.”

     

    ParallelM

    Tel Aviv, Israel & Sunnyvale, California
    www.parallelm.com
    “ParallelM is one of the first software platforms principally focused on the data science operationalization process,” Erick Brethenoux writes. “The focus of data science teams has traditionally been on developing analytical assets, while dealing with the operationalization of these assets has been an afterthought. Deploying analytical assets within operational processes in a repeatable, manageable, secure and traceable manner requires more than a set of APIs and a cloud service; a model that has been scored (executed) has not necessarily been managed. ParallelM’s success and the general development of operationalization functionality within platforms will be an indicator of the success of an entire generation of data scientists.”

     Source: Information Management

     

• Hadoop: what is it for?


Flexible and scalable management of big data

Data infrastructure is the most important organ for creating and delivering good business insights. To benefit from the diversity of data available and to modernize the data architecture, many organizations are deploying Hadoop. A Hadoop-based environment is flexible and scalable in managing big data. What is the impact of Hadoop? The Aberdeen Group investigated the impact of Hadoop on data, people and business performance.

New data from a variety of sources

A lot of data has to be captured, moved, stored and archived. But companies are now also gaining insights from hidden data beyond the traditional structured transaction records: think of e-mails, social data, multimedia, GPS information and sensor data. Alongside new data sources we have also gained a large number of new technologies to manage and exploit all this data. Together, this information and these technologies are shifting big data from a problem into an opportunity.

What are the advantages of this yellow elephant (Hadoop)?

A major frontrunner in this big data opportunity is the Hadoop data architecture. The research shows that companies using Hadoop are more driven to make use of unstructured and semi-structured data. Another important trend is that the mindset of companies is shifting: they see data as a strategic asset and as an important part of the organization.

The need for user empowerment and user satisfaction is one reason why companies choose Hadoop. In addition, a Hadoop-based architecture offers two advantages for end users:

1. Data flexibility – all data under one roof, which leads to higher quality and usability.
2. Data elasticity – the architecture is significantly more flexible when it comes to adding new data sources.

What is the impact of Hadoop on your organization?

What else can you do with Hadoop, and how can you best deploy this data architecture across your data sources? Read in this report how you can save even more time analyzing data and ultimately achieve more profit by deploying Hadoop.

Source: Analyticstoday

  • Harnessing the value of Big Data

To stay competitive and grow in today’s market, it becomes necessary for organizations to closely correlate both internal and external data, and draw meaningful insights out of it.

    During the last decade a tremendous amount of data has been produced by internal and external sources in the form of structured, semi-structured and unstructured data. These are large quantities of human or machine generated data produced by heterogeneous sources like social media, field devices, call centers, enterprise applications, point of sale etc., in the form of text, image, video, PDF and more.

The “Volume”, “Variety” and “Velocity” of data have posed a big challenge to the enterprise. The evolution of “Big Data” technology has been a boon to the enterprise for the effective management of large volumes of structured and unstructured data. Big data analytics is expected to correlate this data and draw meaningful insights out of it.

However, siloed big data initiatives have often failed to provide ROI to the enterprise. A large volume of unstructured data can be more a burden than a benefit, which is why several organizations struggle to turn data into dollars.

    On the other hand, an immature MDM program limits an organization’s ability to extract meaningful insights from big data. It is therefore of utmost importance for the organization to improve the maturity of the MDM program to harness the value of big data.

MDM helps with the effective management of master information coming from big data sources, by standardizing it and storing it in a central repository that is accessible to business units.

MDM and Big Data are closely coupled applications complementing each other. There are many ways in which MDM can enhance big data applications, and vice versa: big data offers context, while master data provides trust.

    MDM and big data – A matched pair

At first glance, it appears that MDM and big data are two mutually exclusive systems with a degree of mismatch. An enterprise MDM initiative is all about solving business issues and improving data trustworthiness through the effective and seamless integration of master information with business processes. Its intent is to create a central trusted repository of structured master information accessible by enterprise applications.

The big data system deals with large volumes of data coming in unstructured or semi-structured format from heterogeneous sources like social media, field devices, log files and machine-generated data. The big data initiative is intended to support specific analytics tasks within a given span of time, after which it is taken down. The comparison below summarizes the characteristics of MDM and big data.

     

MDM versus big data:

• Business objective – MDM: provides a single, trusted version of master and reference information and acts as a system of record/system of reference for the enterprise. Big data: provides cutting-edge analytics and offers a competitive advantage.

• Volume of data and growth – MDM: deals with master data sets that are smaller in volume and grow at a relatively slow rate. Big data: deals with enormous volumes of data, so large that current databases struggle to handle them; growth is very fast.

• Nature of data – MDM: permanent and long-lasting. Big data: ephemeral in nature; disposable if not useful.

• Types of data (structure and data model) – MDM: largely structured data in a definite format with a pre-defined data model. Big data: mostly semi-structured or unstructured, lacking a fixed data model.

• Source of data – MDM: oriented around internal, enterprise-centric data. Big data: a platform to integrate data coming from multiple internal and external sources, including social media, cloud, mobile and machine-generated data.

• Orientation – MDM: supports both analytical and operational environments. Big data: fully analytics-oriented.

    Despite apparent differences there are many ways in which MDM and big data complement each other.

    Big data offers context to MDM

Big data can act as an external source of master information for the MDM hub and can help enrich internal master data in the context of the external world. MDM can help aggregate the required and useful information coming from big data sources with internal master records.

An aggregated view and profile of master information can help link the customer correctly and in turn help perform effective analytics and campaigns. MDM can act as a hub between the system of record and the system of engagement.

However, not all data coming from big data sources will be relevant for MDM. There should be a mechanism to process the unstructured data and distinguish the relevant master information and the associated context. NoSQL offerings, natural language processing and other semantic technologies can be leveraged to distill the relevant master information from a pool of unstructured/semi-structured data.

    MDM offers trust to big data

MDM brings a single integrated view of master and reference information with unique representations for an enterprise. An organization can leverage the MDM system to gauge the trustworthiness of data coming from big data sources.

    Dimensional data residing in the MDM system can be leveraged towards linking the facts of big data. Another way is to leverage the MDM data model backbone (optimized for entity resolution) and governance processes to bind big data facts.

    The other MDM processes like data cleansing, standardization, matching and duplicate suspect processing can be additionally leveraged towards increasing the uniqueness and trustworthiness of big data.

    MDM system can support big data by:

• Holding the “attribute level” data coming from big data sources, e.g. social media IDs, aliases, device IDs, IP addresses, etc.
• Maintaining the code and mapping of reference information.
• Extracting and maintaining the context of transactional data like comments, remarks, conversations, social profiles and statuses.
• Facilitating entity resolution.
• Maintaining unique, cleansed golden master records.
• Managing the hierarchies and structure of the information along with linkages and traceability, e.g. linking an existing customer with his or her Facebook ID, LinkedIn ID, blog alias, etc.

MDM for big data analytics – Key considerations

    Traditional MDM implementation, in many cases, is not sufficient to accommodate big data sources. There is a need for the next generation MDM system to incorporate master information coming from big data systems. An organization needs to take the following points into consideration while defining Next Gen MDM for big data:

    Redefine information strategy and topology

The overall information strategy needs to be reviewed and redefined in the context of big data and MDM. The impact of changes in topology needs to be assessed thoroughly. It is necessary to define the linkages between these two systems (MDM and big data), and how they operate with internal and external data. For example, the data coming from social media needs to be linked with internal customer and prospect data to provide an integrated view at the enterprise level.

The information strategy should address the following:

• Integration point between MDM and big data – how the big data and MDM systems are going to interact with each other.
• Management of master data from different sources – how the master data from internal and external sources is going to be managed.
• Definition and classification of master data – how the master data coming from big data sources gets defined and classified.
• Processing of unstructured and semi-structured master data – how master data from big data sources in the form of unstructured and semi-structured data is going to be processed.
• Usage of master data – how the MDM environment is going to support big data analytics and other enterprise applications.

    Revise data architecture and strategy

The overall data architecture and strategy need to be revised to accommodate changes with respect to big data. The MDM data model needs to be enhanced to accommodate big data-specific master attributes. For example, the data model should accommodate social media and/or IoT-specific attributes such as social media IDs, aliases, contacts, preferences, hierarchies, device IDs, device locations, on-off periods, etc. The data strategy should be defined for the effective storage and management of internal and external master data.

    The revised data architecture strategy should ensure that:

• The MDM data model accommodates all big data-specific master attributes.
• Local and global master data attributes are classified and managed as per business needs.
• The data model has the necessary provisions to interlink the external (big data-specific) and internal master data elements, including provisions for code tables and reference data.

Define advanced data governance and stewardship

A significant number of challenges are associated with governing master data coming from big data sources, because of its unstructured nature and the data flowing in from various external sources. The organization needs to define advanced policies, processes and a stewardship structure that enable big data-specific governance.

    Data governance process for MDM should ensure that:

• The right level of data security, privacy and confidentiality is maintained for customer and other confidential master data.
• The right level of data integrity is maintained between internal master data and master data from big data sources.
• The right level of linkage exists between reference data and master data.
• Policies and processes are redefined/enhanced to support big data and the related business transformation rules, control access for data sharing and distribution, and establish ongoing monitoring, measurement and change mechanisms.
• A dedicated group of big data stewards is available for master data review, monitoring and conflict management.

    Enhance integration architecture

The data integration architecture needs to be enhanced to accommodate the master data coming from big data sources. The MDM hub should have the right level of integration capabilities to integrate with big data using IDs, reference keys and other unique identifiers.

The unstructured, semi-structured and multi-structured data is parsed by a big data parser into logical data objects. This data is then processed further, matched, merged and loaded with the appropriate master information into the MDM hub.

    The enhanced integration architecture should ensure that:

• The MDM environment has the ability to parse, transform and integrate the data coming from the big data platform.
• The MDM environment has the intelligence built in to analyze the relevance of master data coming from the big data environment, and to accept or reject it accordingly.

    Enhance match and merge engine

The MDM system should enhance the “Match & Merge” engine so that master information coming from big data sources can be correctly identified and integrated into the MDM hub. A blend of probabilistic and deterministic matching algorithms can be adopted.

For example, consider the successful identification of the social profiles of existing customers and interlinking them with existing data in the MDM hub. In this context, data quality is more about the information’s utility for the consumer of the data than about objective “quality”.

    The enhanced match and merge engine should ensure that:

    • The master data coming from big data sources get effectively matched with internal data residing in the MDM Hub.
    • The “Duplicate Suspect” master records get identified and processed effectively.
    • The engine should recommend the “Accept”, “Reject”, “Merge” or “Split” of the master records coming from big data sources.
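As a hedged sketch of what such a blended deterministic/probabilistic match might look like (the records, field names and the 0.85 threshold are invented; production MDM hubs use far more sophisticated algorithms), an exact identifier match can be tried first, with fuzzy name similarity as a fallback:

```python
# Minimal sketch of blended deterministic + probabilistic record matching.
# Records, field names, and the 0.85 threshold are illustrative only.
from difflib import SequenceMatcher

internal = {"customer_id": "C-1001", "name": "Jonathan Smith", "email": "j.smith@example.com"}
external = {"social_handle": "@jon_smith", "name": "Jon Smith", "email": "j.smith@example.com"}

def deterministic_match(a: dict, b: dict) -> bool:
    """Exact match on a trusted identifier such as e-mail."""
    return bool(a.get("email")) and a.get("email") == b.get("email")

def probabilistic_score(a: dict, b: dict) -> float:
    """Crude similarity score on the name attribute."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()

if deterministic_match(internal, external):
    decision = "Merge"
elif probabilistic_score(internal, external) >= 0.85:
    decision = "Duplicate suspect - route to a data steward"
else:
    decision = "Reject"

print(decision, f"(name similarity: {probabilistic_score(internal, external):.2f})")
```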

     

    In this competitive era, organizations are striving hard to retain their customers.  It is of utmost importance for an enterprise to keep a global view of customers and understand their needs, preferences and expectations.

Big data analytics coupled with an MDM backbone offers the enterprise a cutting-edge advantage in managing customer-centric functions and increasing profitability. However, the pairing of MDM and big data is not free of complications; the enterprise needs to work diligently on the interface points so as to best harness these two technologies.

Traditional MDM systems need to be enhanced to accommodate the information coming from big data sources and to draw meaningful context from it. The big data system, in turn, should leverage the MDM backbone to interlink data and draw meaningful insights.

Source: Information Management, 2017, Sunjay Kumar

• Hey Data Scientist! Are you a geek, nerd or suit?

Data scientists are known for their unique skill sets. While thousands of compelling articles have been written about what a data scientist does, most of these articles fall short in examining what happens after you’ve hired a new data scientist to your team.

    The onboarding process for your data scientist should be based on the skills and areas of improvement you’ve identified for the tasks you want them to complete. Here’s how we do it at Elicit.

We’ve all seen the data scientist Venn diagrams over the past few years, which include three high-level types of skills: programming, statistics/modeling, and domain expertise. Some even feature the ever-elusive “unicorn” at the center.

    While these diagrams provide us with a broad understanding of the skillset required for the role in general, they don’t have enough detail to differentiate data scientists and their roles inside a specific organization. This can lead to poor hires and poor onboarding experiences.

If the root of what a data scientist does and is capable of is not well understood, then both parties are in for a bad experience. Near the end of 2016, Anand Ramanathan wrote a post that really stuck with me called The Data Science Delusion (https://medium.com/@anandr42/the-data-science-delusion-7759f4eaac8e). In it, Ramanathan talks about how within each layer of the data science Venn diagram there are degrees of understanding and capability.

    For example, Ramanathan breaks down the modeling aspect into four quadrants based on modeling difficulty and system complexity, explaining that not every data scientist has to be capable in all four quadrants—that different problems call for different solutions and different skillsets. 

    For example, if I want to understand customer churn, I probably don’t need a deep learning solution. Conversely, if I’m trying to recognize images, a logistic regression probably isn’t going to help me much.

    In short, you want your data scientist to be skilled in the specific areas that role will be responsible for within the context of your business.

    Ramanathan’s article also made me reflect on our data science team here at Elicit. Anytime we want to solve a problem internally or with a client we use our "Geek Nerd Suit" framework to help us organize our thoughts.

    Basically, it states that for any organization to run at optimal speed, the technology (Geek), analytics (Nerd), and business (Suit) functions must be collaborating and making decisions in lockstep. Upon closer inspection, the data science Venn diagram is actually comprised of Geek (programming), Nerd (statistics/modeling), and Suit (domain expertise) skills.

    But those themes are too broad; they still lack the detail needed to differentiate the roles of a data scientist. And we’d heard this from our team internally: in a recent employee survey, the issue of career advancement, and more importantly, skills differentiation, cropped up from our data science team.

    As a leadership team, we always knew the strengths and weaknesses of our team members, but for their own sense of career progression they were asking us to be more specific and transparent about them. This pushed us to go through the exercise of taking a closer look at our own evaluation techniques, and resulted in a list of specific competencies within the Geek, Nerd, and Suit themes. We now use these competencies both to assess new hires and to help them develop in their careers once they’ve joined us.

    For example, under the Suit responsibilities we define a variety of competencies that, amongst other things, include adaptability, business acumen, and communication. Each competency then has an explicit set of criteria associated with it that illustrates the different levels of mastery within that competency.

    We’ve established four levels of differentiation: “entry level,” “intermediate,” “advanced” and “senior.” To illustrate, here’s the distinction between “entry level” and “intermediate” for the Suit: Adaptability competency:

    Entry Level:

    • Analyzes both success and failures for clues to improvement.
    • Maintains composure during client meetings, remaining cool under pressure and not becoming defensive, even when under criticism.

    Intermediate:

    • Experiments and perseveres to find solutions.
    • Reads situations quickly.
    • Swiftly learns new concepts, skills, and abilities when facing new problems.

    And there are other specific criteria for the “advanced” and “senior” levels as well. 

    This led us to four unique data science titles—Data Scientist I, II, and III, as well as Senior Data Scientist, with the latter title still being explored for further differentiation. 

    The Geek Nerd Suit framework, and the definitions of the competencies within them, gives us clear, explicit criteria for assessing a new hire’s skillset in the three critical dimensions that are required for a data scientist to be successful.

    In Part 2, I’ll discuss what we specifically do within the Geek Nerd Suit framework to onboard a new hire once they’ve joined us—how we begin to groom the elusive unicorn. 

    Source: Information Management

    Author: Liam Hanham

  • How does augmented intelligence work?

    Computers and devices that think along with us have long ceased to be science fiction. Artificial intelligence (AI) can be found in washing machines that adjust their program to the size of the load and in computer games that adapt to the level of the players. How can computers help people make smarter decisions? This extensive whitepaper describes which models are applied in the HPE IDOL analytics platform.

    Mathematical models provide the human touch

    Processors can perform a calculation in the blink of an eye that would take people weeks or months. That is why computers are better chess players than humans, but worse at poker, where the human element plays a bigger role. How does a search and analytics platform ensure that more of the ‘human’ ends up in the analysis? This is achieved by using various mathematical models.

    Analysis of text, audio, images and faces

    The art is to extract actionable information from data. This is done by applying pattern recognition to different datasets. In addition, classification, clustering and analysis play a major role in obtaining the right insights. Not only text is analyzed; increasingly, audio files and images, objects and faces are analyzed as well.
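    As an illustration of the kind of clustering step described here (this is not HPE IDOL's implementation, just a generic sketch with made-up example texts), short documents can be grouped using TF-IDF features and k-means:

    # Generic text clustering sketch; example documents are invented.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    docs = [
        "invoice payment overdue reminder",
        "payment received thank you",
        "server outage reported in datacenter",
        "network outage affecting customers",
    ]
    X = TfidfVectorizer().fit_transform(docs)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(list(zip(docs, labels)))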

    Artificial intelligence helps people

    The whitepaper describes in detail how patterns are found in text, audio and images. How does a computer understand that the video it is analyzing is about a person? How are flat images turned into a geometric 3D image, and how does a computer decide what it sees? Think, for example, of an automated alert to the control room when a grandstand gets too crowded or a traffic jam forms. How do theoretical models help computers perceive like humans and support our decisions? You can read about that and more in the whitepaper Augmented intelligence: Helping humans make smarter decisions, available via AnalyticsToday.

    Source: Analyticstoday.nl, October 12, 2016

  • How Nike And Under Armour Became Big Data Businesses

    Like the Yankees vs the Mets, Arsenal vs Tottenham, or Michigan vs Ohio State, Nike and Under Armour are some of the biggest rivals in sports.
     
    But the ways in which they compete — and will ultimately win or lose — are changing.
     
    Nike and Under Armour are both companies selling physical sports apparel and accessories products, yet both are investing heavily in apps, wearables, and big data.  Both are looking to go beyond physical products and create lifestyle brands athletes don’t want to run without.
     
    Nike
     
    Nike is the world leader in multiple athletic shoe categories and holds an overall leadership position in the global sports apparel market. It also boasts a strong commitment to technology, in design, manufacturing, marketing, and retailing.
     
    It has 13 different lines, in more than 180 countries, but how it segments and serves those markets is its real differentiator. Nike calls it “category offense,” and divides the world into sporting endeavors rather than just geography. The theory is that people who play golf, for example, have more in common than people who simply happen to live near one another.
     
    And that philosophy has worked, with sales reportedly rising more than 70% since the company shifted to this strategy in 2008. This retail and marketing strategy is largely driven by big data.
     
    Another place the company has invested big in data is with wearables and technology. Although it discontinued its own FuelBand fitness wearable in 2014, Nike continues to integrate with many other brands of wearables, including Apple, which has recently announced the Apple Watch Nike+.
     
    But the company clearly has big plans for its big data as well. In a 2015 call with investors about Nike’s partnership with the NBA, Nike CEO Mark Parker said, “I’ve talked with commissioner Adam Silver about our role enriching the fan experience. What can we do to digitally connect the fan to the action they see on the court? How can we learn more about the athlete, real-time?”
     
    Under Armour
     
    Upstart Under Armour is betting heavily that big data will help it overtake Nike. The company has recently invested $710 million in acquiring three fitness app companies, including MyFitnessPal, and their combined community of more than 120 million athletes — and their data.
     
    While it’s clear that both Under Armour and Nike see themselves as lifestyle brands more than simply apparel brands, the question is how this shift will play out.
     
    Under Armour CEO Kevin Plank has explained that, along with a partnership with a wearables company, these acquisitions will drive a strategy that puts Under Armour directly in the path of where big data is headed: wearable tech that goes way beyond watches.
     
    In the not-too-distant future, wearables won’t just refer to bracelets or sensors you clip on your shoes, but rather apparel with sensors built in that can report more data more accurately about your movements, your performance, your route and location, and more.
     
    “At the end of the day we kept coming back to the same thing. This will help drive our core business,” Plank said in a call with investors. “Brands that do not evolve and offer the consumer something more than a product will be hard-pressed to compete in 2015 and beyond.”
     
    The company plans to provide a full suite of activity and nutritional tracking and expertise in order to help athletes improve, with the assumption that athletes who are improving buy more gear.
     
    If it has any chance of unseating Nike, Under Armour has to innovate, and that seems to be exactly where this company is planning to go. But it will have to connect its data to its innovations lab and ultimately to the products it sells for this investment to pay off.
     
     
    Source: forbes.com, November 15, 2016
  • How patent data can provide intelligence for other markets

    How patent data can provide intelligence for other markets

    Patents are an interesting phenomenon. Believe it or not, the number one reason why patent systems exist is to promote innovation and the sharing of ideas. Simply put, a patent is really a trade. A government has the ability to give a limited monopoly to an inventor. In exchange for this exclusivity, the inventor provides a detailed description of their invention. The application for a patent needs to include enough detail about the technology that a peer in the field could pick it up and understand how to make or practice that invention. Next, this description gets published to the world so others can read and learn from it. That's the exchange. Disclose how your invention is made, in enough detail to replicate, and you can get a patent.

    It gets really interesting when you consider that the patent carries additional metadata with it. This additional data is above and beyond the technical description of the invention. Included in this data are the inventor names, addresses, the companies they work for (the patent owner), the date of the patent filing, a list of related patents/applications, and more. This metadata and the technical description of the invention make up an amazing set of data identifying research and development activity across the world. Also, since patents are issued by governments, they are inherently geographic. This means that an inventor has to apply for a patent in every country where they want protection. Add the fact that patents are quite expensive, and we are left with a set of ideas that have at least passed some minimal value threshold. That willingness to spend money signals a value in the technology, specifically a value of that technology in the country/market where each patent is filed. In many ways, if you want to analyze a technology space, patent data can be better than analyzing products. The technology is described in substantial detail and, in many cases, identifies tech that has not hit the market yet.

    Breaking down a patent dataset

    Black Hills IP has amassed a dataset of over 100 million patent and patent application records from across the world. We not only use data published by patent offices, but we also run proprietary algorithms on that data to create additional patent records and metadata. This means we have billions of data points to use in analysis, and likely have the largest consolidated patent dataset in the world. In the artificial intelligence (AI) space alone, we have an identified set of between one hundred thousand and two hundred thousand patent records. This has been fertile ground for analysis and insight.

    Breaking down this dataset, we can see ownership and trends around foundational and implementational technologies. For example, several of the large US players are covering their bases with patent filings in multiple jurisdictions, including China. Interestingly enough, the inverse is not necessarily shown. Many of the inventions from Chinese companies have their patent filings (and thus protection) limited to just China. While many large US companies in the field tend to have their patent portfolios split roughly 50/50 between US and international patent filings, the top players in China have a combined distribution with well over 75% domestic and only the remainder in international jurisdictions. This means that there is a plethora of technology protected only within the borders of China, and the implications could be significant given the push for AI technology development in China and the wealth of resources available to fuel that development.
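    The kind of breakdown described above can be sketched on a hypothetical patent-records table; the assignees, jurisdictions and counts below are illustrative only:

    # Share of domestic vs. foreign filings per assignee (hypothetical data).
    import pandas as pd

    records = pd.DataFrame({
        "assignee": ["A Corp", "A Corp", "A Corp", "B Tech", "B Tech", "B Tech"],
        "jurisdiction": ["US", "EP", "CN", "CN", "CN", "US"],
        "home_country": ["US", "US", "US", "CN", "CN", "CN"],
    })
    records["domestic"] = records["jurisdiction"] == records["home_country"]
    print(records.groupby("assignee")["domestic"].mean())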

    So what?

    Why does all this matter? When patents are filed in a single jurisdiction only, they are visible to the world and open the door to free use outside the country of filing. In years past, we have seen Chinese companies repurpose Silicon Valley technologies for the China domestic market. With more of a historical patent thicket in the US than in China, this strategy made sense. When development and patent protection have been strong in the US, repurposing that technology in a less protected Chinese market is not only possible, but a viable business model. What we’re seeing now in the emerging field of AI technology, specifically the implementation of such technologies, is the pendulum starting to swing back.

    In an interesting reversal of roles, the publication of Chinese patents on technologies not concurrently protected in the US has the potential to drive a copying of Chinese-originated AI tech in the US market. We may see some rapid growth of implementational AI technologies in the US or other western countries, fueled by Chinese development and domestic-focused IP strategy. Of course, there are many other insights to glean out of this wealth of patent data. The use of these patent analytics in the technology space will only increase as the patent offices across the world improve their data reporting and availability. Thanks to advances by some of the major patent offices, visibility into new developments is getting easier and easier. Technology and business intelligence programs stand to gain substantially from the insights hidden in IP data.

    Author: Tom Marlow

    Source: Oracle

  • How the data-based gig economy affects all markets

    How the data-based gig economy affects all markets

    Data is infinite. Any organization that wants to grow at a meaningful pace would be wise to learn how to leverage the vast amount of data available to drive growth. Just ask the top five companies in the world today: Apple, Amazon, Google, Facebook, and Microsoft. All these technology giants either process or produce data.

    Companies like these with massive stockpiles of data often find themselves surrounded by other businesses that use that data to operate. Salesforce is a great example: each year at its Dreamforce conference in San Francisco, hundreds of thousands of attendees and millions of viewers worldwide prove just how many jobs the platform has created.

    Other companies are using vast amounts of information from associated companies to enhance their own data or to provide solutions for their clients to do so. When Microsoft acquired LinkedIn, for instance, it acquired 500 million user profiles and all of the data that each profile has generated on the platform. All ripe for analysis.

    With so much growth evolving from a seemingly infinite ocean of data, tomorrow’s leading companies will be those that understand how to capture, connect, and leverage information into actionable insight. Unless a company is already on the top 10 list of the largest organizations, the problem it most likely faces is a shortage of highly skilled talent that can do this work. Enter the data scientist.

    More data, more analysts

    The sheer amount of data at our fingertips isn’t the only thing that’s growing. According to an Evans Data report, more than 6 million developers across the world are officially involved in analyzing big data. Even traditionally brick-and-mortar retail giant Walmart plans to hire 2,000 tech experts, including data scientists, for that specific purpose.

    Companies old and new learned long ago that data analysis is vital to understanding customers’ behavior. Sophisticated data analytics can reveal when customers are likely to buy certain products and what marketing methods would be effective in certain subgroups of their customer base.

    Outside of traditional corporations, companies in the gig economy are relying even more on data to utilize their resources and workforce more efficiently. For example, Uber deploys real-time user data to determine how many drivers are on the road at any given time, where more drivers are needed, and when to enact a surge charge to attract more drivers.
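    Uber's actual pricing algorithm is not public, but the supply-and-demand idea can be sketched with a hypothetical surge multiplier driven by the ratio of open ride requests to available drivers:

    # Hypothetical surge multiplier, for illustration only.
    def surge_multiplier(open_requests: int, available_drivers: int,
                         cap: float = 3.0) -> float:
        if available_drivers == 0:
            return cap
        ratio = open_requests / available_drivers
        # No surge while supply covers demand, then scale up to a cap
        return min(cap, max(1.0, ratio))

    print(surge_multiplier(open_requests=45, available_drivers=30))  # 1.5
    print(surge_multiplier(open_requests=10, available_drivers=40))  # 1.0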

    Data scientists are in demand and being hired by the thousands. Some of the most skilled data scientists are going the freelance route because their expertise allows them to choose more flexible work styles. But how can data scientists who aren’t interested in becoming full-time, in-house hires ensure that the companies for which they freelance are ready for their help?

    The data-based gig economy

    Gartner reports that the number of freelance data scientists will grow five times faster than that of traditionally employed ones by next year. The data-based gig economy can offer access to top talent on flexible schedules. But before data scientists sign on for a project, they should check to see that companies are prepared in the following areas:

    • Companies need to understand their data before they decide what to do with it. That data could include inventory, peak store hours, customer data, or other health metrics.
    • Next, businesses should have streamlined the way they collect and store their data to make it easy to analyze. Use of a CRM platform is a good indicator of preparedness at this stage.
    • Finally, companies need to be able to act on the insights they glean. After freelancers are able to use organizations’ collected and organized data to find valuable connections and actionable insights, those organizations should have a process for implementing the discoveries.

    Today’s organizations need data in order to be successful, and they need data scientists to make use of that data. In order for both parties to thrive in this era, companies need to have the right strategies in place before they invest in freelance talent. When they do, freelance data scientists will have the opportunity to gather critical knowledge from the data and use their talents to drive innovation and success.

    Author: Marcus Sawyerr

    Source: Insidebigdata

  • ING and TU Delft join forces in new AI lab

    ING and TU Delft join forces in new AI lab

    ING and TU Delft are pooling their knowledge and expertise in artificial intelligence (AI) within the financial sector in the new AI for FinTech Lab (AFL). The goal of the collaboration in the AFL is to use artificial intelligence technology to improve the effectiveness and efficiency of data and software analysis.

    ING and TU Delft have been collaborating on software research and development for some time. Within the new AFL, researchers and students from TU Delft will conduct research into the development of software for the financial sector, including autonomous software and systems for data analysis and data integration. Within this collaboration, ING provides an indispensable IT infrastructure, an ambitious organizational structure for software development, and a leading position in data fluency and analytics delivery.

    Validated solutions

    According to Arie van Deursen, professor of software engineering at TU Delft and scientific director of the AFL, the AFL is a logical next step for TU Delft in its collaboration with ING. ''It offers the opportunity to develop new theories, methods and tools in the field of artificial intelligence, and to attract new talent. We expect the collaboration within the AFL to lead not only to groundbreaking theories, but also to validated solutions that can be widely distributed.”

    Görkem Köseoğlu, Chief Analytics Officer at ING: ‘Using customer data offers great opportunities to develop better services, but at the same time it must be shaped carefully. Customers attach great importance to their data, and ING greatly values its customers’ trust. The collaboration with TU Delft is therefore essential to achieving both goals.’

    The AFL is located at two sites: the ING campus in Amsterdam and the TU Delft campus in Delft. In this way, it brings together students, software and data specialists, researchers and entrepreneurs from both organizations.

    AI for FinTech Lab and ICAI

    The AFL is part of ICAI, the Innovation Center for Artificial Intelligence. This is a national network focused on technology and talent development between knowledge institutions, industry and government in the field of artificial intelligence. ICAI's innovation strategy is organized around industry labs: research labs built on multi-year strategic collaborations with the business community. This allows the AFL to exchange knowledge and expertise even more effectively with other ICAI partners, such as Elsevier, Qualcomm, Bosch, Ahold Delhaize, and the Dutch National Police.

    Source: BI platform

  • Artificial intelligence learns to drive with GTA


    Anyone who has ever played Grand Theft Auto (GTA) knows that the game is not designed for following the rules. Yet according to researchers at the Technical University of Darmstadt, GTA can help an artificial intelligence learn to drive through traffic, writes MIT's university magazine, Technology Review.

    Researchers therefore use the game to teach algorithms how to behave in traffic. According to the university, the realistic world of computer games such as GTA is very well suited to better understanding the real world. Virtual worlds are already used to feed data to algorithms, but by using games those worlds do not have to be created specifically for that purpose.

    Learning to drive in Grand Theft Auto works much the same as in the real world. For self-driving cars, objects and people, such as pedestrians, are labeled. Those labels can be fed to the algorithm, which enables it to distinguish between different objects and other road users in both the real world and the video game.

    This is not the first time artificial intelligence has been used to play computer games. Researchers have already worked on a smart Mario, and Minecraft is being used for a similar purpose as GTA: Microsoft uses its virtual world to teach characters how to maneuver through the environment. The knowledge gained can later be used to help robots overcome similar obstacles in the real world.

    Source: numrush.nl, September 12, 2016

     

  • Lessons From The U.S. Election On Big Data And Algorithms

    The failure to accurately predict the outcome of the elections has caused some backlash against big data and algorithms. This is misguided. The real issue is the failure to build unbiased models that will identify trends that do not fit neatly into our present understanding. This is one of the most urgent challenges for big data, advanced analytics and algorithms. When speaking with retailers on this subject I focus on two important considerations. The first is that the overlap between what we believe to be true and what is actually true is getting smaller.


    This is because people, consumers, have more personal control than ever before. They source opinions from the web, social media, groups and associations that in the past were not available to them. For retailers this is critical because the historical view that the merchandising or marketing group holds about consumers is likely growing increasingly out of date. Yet well-meaning business people performing these tasks continue to disregard indicators and repeat the same actions. Before consumers had so many options this was not a huge problem, since change happened more slowly. Today if you fail to catch a trend there are tens or hundreds of other companies out there ready to capitalize on the opportunity. While it is difficult to accept, business people must learn a new skill: leveraging analytics to improve their instincts.

    The second is closely related to the first but with an important distinction: go where the data leads. I describe this as the KISS that connects big data to decisions.
    The KISS is about extracting knowledge, testing innovations, developing strategies, and doing all this at high speed. The KISS is what allows the organization to safely travel down the path of discovery – going where the data leads – without falling down a rabbit hole.
    Getting back to the election prognosticators, there were a few that did identify the trend. They were repeatedly laughed at and disregarded. This is the foundation of the problem: organizations must foster environments where new ideas are embraced and safely explored. This is how we will grow the overlap between what we believe and what is actually true.
     
    Source: Gartner, November 10, 2016
  • Localization uses Big Data to Drive Big Business


    There’s growing interest in using big data for business localization now, although the use of customer data for optimal orientation of business locations and promotions has been around for at least a decade.

    In 2006, the Harvard Business Review declared the end of big-box retail standardization in favor of catering to customers’ local and regional tastes, fostering innovation, and – not incidentally – making it harder for competitors to copy their store formats by changing up the one-size-fits-all approach. A decade later, analytics are affordable for businesses of all sizes, giving smaller players in a variety of industries the ability to localize as well.

    An example of early localization of items sold comes from Macy’s. Executive search firm Caldwell Partners describes the department-store chain’s vast localization project, which began in the mid-2000s to differentiate store inventories for customer preferences, beginning in markets such as Miami, Columbus, and Atlanta. This strategy has helped Macy’s remain profitable despite ongoing major declines in department-store sales in recent years.

    Localization for stronger consumer appeal, better product offerings

    In hospitality, hotel chains now use localization strategies to compete with locally owned boutique hotels and with Airbnb rentals that promise a “live like a local” experience.

    Visual News reports that Millennials’ tastes and preferences are driving this trend. These younger travel enthusiasts want a unique experience at each destination, even if they’re staying in properties owned by the same hotel brand.

    Hospitality Technology notes that today’s customer profile data gives hotel chains a “360 degree view of customer spending behavior across industries, channels, and over time,” for more precise location orientation and targeted marketing.

    In fact, any consumer-facing business can benefit from using local-market data. GIS firm ESRI has described how individual bank branches can orient their loan offerings to match the needs and risk profiles of customers in the immediate area. Other elements that can be localized to suit area customers’ tastes and spending power include product prices, menu items, location hours, staffing levels, décor, and product displays.

    Localization for more effective marketing

    Outside the store itself, localization is a powerful tool for improving the return on marketing. By using detailed data about local customer behavior, retailers, restaurants and other businesses can move from overly broad promotions to segmented offers that closely align with each segment’s preferences.
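    As a minimal sketch of the segmentation step behind such offers (the features and data below are hypothetical), customers can be grouped with a standard clustering algorithm and each segment can then receive its own promotion:

    # Hypothetical customer segmentation for localized offers.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(1)
    customers = np.column_stack([
        rng.uniform(10, 90, 300),     # average basket value (hypothetical)
        rng.integers(1, 12, 300),     # store visits per month (hypothetical)
        rng.uniform(0.5, 25.0, 300),  # distance to store in km (hypothetical)
    ])
    X = StandardScaler().fit_transform(customers)
    segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
    print(np.bincount(segments))  # number of customers per segment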

    In some cases, this type of marketing localization can reduce expenses (for example, by lowering the total number of direct-mail pieces required for a campaign) while generating higher redemption rates.

    Localization of marketing efforts goes beyond cost savings to the establishment of customer loyalty and competitive advantage. Study after study shows that consumers expect and respond well to offers based on their preferences, but companies have been slow to provide what customers want.

    An international study reported by Retailing Today in June found that 78% of consumers make repeat purchases when they receive a personalized promotion, and 74% buy something new. Despite this, the study found that less than 30% of the companies surveyed were investing heavily in personalization.

    A similar 2015 study focusing on North American consumers, described by eMarketer, found that more than half of the consumers surveyed wanted promotions tailored to their product preferences, age range, personal style, and geographic location. That study found that although 71% of the regional retailers in the survey say they localize and personalize promotional emails, half the consumers said they got promotional emails that didn’t align with their preferences.

    Clearly, there’s room for improvement in the execution of localized marketing, and businesses that get it right will have an advantage with customers whose expectations are going unmet right now.

    Smart localization and orientation involve understanding the available data and knowing how to use it in cost-effective ways to give customers the information they want. It also involves rethinking the way businesses and consumers interact, and the role geography plays in business.

    Localization and careful audience targeting may be the keys to business survival. A 2013 Forrester report proclaimed that in the digital age, “the only sustainable competitive advantage is knowledge of and engagement with customers.”

    With so much power of choice in the hands of consumers, it’s up to retailers, restaurants and other businesses to earn their loyalty by delivering what they want in real time, no matter where they’re located.

    Author: Charles Hogan

    Charles Hogan is co-founder and CEO at Tranzlogic. He has over 20 years of experience in the fintech, data analytics, retail services and payment processing industries. Follow on Twitter: @Tranzlogic.

  • Machine learning, AI, and the increasing attention for data quality

    Machine learning, AI, and the increasing attention for data quality

    Data quality has been going through a renaissance recently.

    As a growing number of organizations increase efforts to transition computing infrastructure to the cloud and invest in cutting-edge machine learning and AI initiatives, they are finding that the main barrier to success is the quality of their data.

    The old saying “garbage in, garbage out” has never been more relevant. With the speed and scale of today’s analytics workloads and the businesses that they support, the costs associated with poor data quality are also higher than ever.

    This is reflected in a massive uptick in media coverage on the topic. Over the past few months, data quality has been the focus of feature articles in The Wall Street Journal, Forbes, Harvard Business Review, MIT Sloan Management Review and others. The common theme is that the success of machine learning and AI is completely dependent on data quality. A quote that summarizes this dependency very well is this one by Thomas Redman: ''If your data is bad, your machine learning tools are useless.''

    The development of new approaches towards data quality

    The need to accelerate data quality assessment, remediation and monitoring has never been more critical for organizations, and they are finding that the traditional approaches to data quality don’t provide the speed, scale and agility required by today’s businesses.

    For this reason, the highly rated data preparation business Trifacta recently announced an expansion into data quality and unveiled two major new platform capabilities: active profiling and smart cleaning. This is the first time Trifacta has expanded its focus beyond data preparation. By adding new data quality functionality, the business aims to handle a wider set of data management tasks as part of a modern DataOps platform.

    Legacy approaches to data quality involve many manual, disparate activities as part of a broader process. Dedicated data quality teams, often disconnected from the business context of the data they are working with, manage the process of profiling, fixing and continually monitoring data quality in operational workflows. Each step must be managed in a completely separate interface. It’s hard to iteratively move back-and-forth between steps such as profiling and remediation. Worst of all, the individuals doing the work of managing data quality often don’t have the appropriate context for the data to make informed decisions when business rules change or new situations arise.

    Trifacta uses interactive visualizations and machine intelligence to guide users, highlighting data quality issues and providing intelligent suggestions on how to address them. Profiling, user interaction, intelligent suggestions, and guided decision-making are all interconnected, each driving the others. Users can seamlessly transition back and forth between steps to ensure their work is correct. This guided approach lowers the barrier for users and helps democratize the work beyond siloed data quality teams, allowing those with the business context to own and deliver quality outputs more efficiently to downstream analytics initiatives.
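    Trifacta's own platform is not shown here, but the kind of checks a guided profiling step surfaces can be sketched generically with pandas; the table and thresholds below are hypothetical:

    # Generic data-profiling sketch: missing values, duplicates, outliers.
    import pandas as pd

    orders = pd.DataFrame({
        "order_id": [1, 2, 2, 4],
        "amount": [19.99, None, 250000.0, 35.50],
        "country": ["NL", "DE", "DE", None],
    })
    profile = {
        "rows": len(orders),
        "missing_per_column": orders.isna().sum().to_dict(),
        "duplicate_order_ids": int(orders["order_id"].duplicated().sum()),
        "suspicious_amounts": int((orders["amount"] > 10000).sum()),
    }
    print(profile)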

    New data platform capabilities like this are only a first (albeit significant) step into data quality. Keep your eyes open and expect more developments towards data quality in the near future!

    Author: Will Davis

    Source: Trifacta

  • Magic Quadrant: 17 top data science and machine learning platforms

    RapidMiner, TIBCO Software, SAS and KNIME are among the leading providers of data science and machine learning products, according to the latest Gartner Magic Quadrant report.

    About this Magic Quadrant report

    Gartner Inc. has released its "Magic Quadrant for Data Science and Machine Learning Platforms," which looks at software products that enable expert data scientists, citizen data scientists and application developers to create, deploy and manage their own advanced analytic models. According to Gartner analysts and report authors Carlie Idoine, Peter Krensky, Erick Brethenoux and Alexander Linden, "We define a data science platform as: A cohesive software application that offers a mixture of basic building blocks essential for creating all kinds of data science solutions, and for incorporating those solutions into business processes, surrounding infrastructure and products." Here are the top performers, categorized as Leaders, Challengers, Visionaries or Niche Players.

    Leaders

    According to the Gartner analysts, “Leaders have a strong presence and significant mind share in the data science and ML market. They demonstrate strength in depth and breadth across the full data exploration, model development and operationalization process. While providing outstanding service and support, Leaders are also nimble in responding to rapidly changing market conditions. The number of expert and citizen data scientists using Leaders’ platforms is significant and growing. Leaders are in the strongest position to influence the market’s growth and direction. They address the majority of industries, geographies, data domains and use cases, and therefore have a solid understanding of, and strategy for, this market.” 

    RapidMiner

    RapidMiner is based in Boston, MA. Its platform includes RapidMiner Studio, RapidMiner Server, RapidMiner Cloud, RapidMiner Real-Time Scoring and RapidMiner Radoop. “RapidMiner remains a Leader by striking a good balance between ease of use and data science sophistication,” the Gartner analysts say. “Its platform’s approachability is praised by citizen data scientists, while the richness of its core data science functionality, including its openness to open-source code and functionality, make it appealing to experienced data scientists, too.”

    Tomorrow: nr. 3 of the data science platform suppliers.

    Source: Information Management

    Author: David Weldon

  • NLP: the booming technology for data-driven businesses

    NLP: the booming technology for data-driven businesses

    Becoming data-driven is the new mantra in business. The consensus is that the most value can be achieved by getting the data in the hands of decision makers. However, this is only true if data consumers know how to handle data. Getting managers to think like data scientists is one way to approach this challenge, another is to make data more approachable and more human. As such, it’s no surprise that natural language processing (NLP) is the talk of the data-driven town.

    Emergence of a new type of language

    While NLP might not be the first thing that comes to mind when considering the next generation of digital user interaction, it’s by no means a new concept or technology. As referenced on Wikipedia, NLP is a subfield of computer science, information engineering, and artificial intelligence (AI) concerned with the interactions between computers and human (natural) languages; in particular, how to program computers to process and analyze large amounts of natural language data.

    Developments in the related disciplines of machine learning (ML) and AI are propelling the use of NLP forward. Industry leaders like Gartner have identified conversational analytics as an emerging paradigm. This shift enables business professionals to explore their data, generate queries, and receive and act on insights using natural language. This can be through voice or text, through mobile devices, or through personal assistants, for example.

    Becoming fluent in NLP

    Deloitte found that when facing strategic obstacles that can hinder innovation and muddy the decision-making process, such as organizational silos, businesses with leaders who embody the characteristics of the Industry 4.0 persona “the Data-Driven Decisive” overcome these roadblocks through a methodical, data-driven approach and are often bolder in their decisions.

    In order to effectively apply data throughout an organization, companies need to provide employees with a base-level understanding of the importance and role of data within their business. Looking at the overwhelming demand to be met here, this challenge needs to be approached from both ends through education and tools.

    Teaching employees how they can use data and which questions to ask will go some distance to establishing a group of data-capable individuals within the workforce. Giving them the effective media through which they can consume data exponentially increases the number of people that can manipulate, analyze, and visualize data in a way that allows them to make better decisions.

    The aim is not to convert everyone into a data scientist. Data specialists will still be needed to do more forward-looking number crunching, and both groups might yield different solutions. Natural language processing as used in Tableau’s Ask Data solution mainly aims to lower the bar for all the non-data experts to use data to improve the results of their day-to-day jobs.

    Deciphering ambiguity

    Inference remains an area where things can get a bit complicated. NLP is good at interpreting language and spotting ambiguity in elements when there isn’t enough clarity in data sets.

    A business user enters a search term in Ask Data and sees the answer presented in the most insightful way. But the work of pulling the right elements from the right tables and variables, the actual SQL query under the hood, stays hidden from the user’s view.
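    This is not Tableau's Ask Data implementation, but a toy sketch can illustrate the idea of mapping a natural-language question onto a SQL query that stays hidden from the user; the table, metrics and dimensions are invented:

    # Toy natural-language-to-SQL mapping, for illustration only.
    TABLE = "sales"
    METRICS = {"revenue": "SUM(revenue)", "orders": "COUNT(*)"}
    DIMENSIONS = {"region", "product", "month"}

    def question_to_sql(question: str) -> str:
        words = question.lower().split()
        metric = next((METRICS[w] for w in words if w in METRICS), "COUNT(*)")
        dims = [w for w in words if w in DIMENSIONS]
        sql = f"SELECT {', '.join(dims + [metric])} FROM {TABLE}"
        if dims:
            sql += f" GROUP BY {', '.join(dims)}"
        return sql

    print(question_to_sql("show revenue by region"))
    # SELECT region, SUM(revenue) FROM sales GROUP BY region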

    NLP is good at leaving no stone unturned when it comes to solving problems. But NLP alone is not the best interface when the user doesn’t know enough about what they’re looking for, can’t articulate a question, or would rather choose from a list of options. For example, a user might not know what the name of a particular product is, but if they click to view a menu with a list of products to filter through, they’ll be able to make an easier choice. This is where mixed-modality systems like Ask Data shine.

    NLP is still not the most effective at resolving a query when there’s lots of room for interpretation—especially when it hasn’t seen the specific query before. For example, if a colleague were to ask to “email Frank,” then we as humans tend to know to look for the Franks we know professionally, not the Franks in our family or circle of friends. As humans, we have the advantage of tapping our memory to inform the context of a request based on who is making the request. NLP still has some catching up to do in this department.

    Enabling a culture of data

    For companies looking to start talking with their data, the most important first step is to enable a culture of data. It is also important to pay attention to the needs and wants of the people that are required to handle data.

    As with a lot of other implementations, starting with a small team and then expanding tends to be a successful approach. By equipping your team with the tools needed to explore data and ask questions, the team will then get exposed to the new ways data can be accessed. It’s also vital to make them aware of the growing global community resource of data explorers that function as a sharing economy of tips and tricks.

    Lastly, as functionality is still very much developing, providing insight to vendors to inform product updates and new capabilities is invaluable. Endless chatter will get you nowhere. Meaningful conversations, with data, are the ones that count.

    Author: Ryan Atallah

    Source: Tableau

  • Northern Netherlands joins forces in unique Data Science program

    On March 7, the Data Science program starts in the Northern Netherlands. To manage the ever-growing amount of data, IT Academy Noord-Nederland is training professionals from the North to become data scientists. With accredited courses from Hanzehogeschool Groningen and Rijksuniversiteit Groningen, the program bridges applied and academic education. The program was set up in collaboration with the business community.

    There are more and more opportunities for companies and institutions to offer new products and services in innovative ways using enormous amounts of data. How can companies handle this data, and what about privacy and data ownership? Collecting data is step one, but being able to organize and analyze it is what creates value. A well-known example is Uber, which used big data to create a completely new, disruptive business model for the transport sector.


    Demand for data scientists is increasing. The Data Science program is the first of its kind in the Northern Netherlands. The RDW, with its data-intensive operations and its call for a big data program, played a crucial role in the development phase. To load the program with the right elements, the IT Academy combined the strengths of the Hanzehogeschool and the RUG. Professors and lecturers from both institutions will teach parts of the program. In addition, guest speakers from other knowledge institutions and the business community provide real-world cases, so the acquired knowledge can be applied immediately.

    IT Academy Noord-Nederland
    IT Academy Noord-Nederland offers state-of-the-art education and conducts research through open collaboration between companies, knowledge institutions and organizations, in order to strengthen the innovative capacity of the Northern Netherlands, stimulate ICT employment and be an attractive landing place for talent. IT Academy Noord-Nederland is an initiative of Hanzehogeschool Groningen, the Rijksuniversiteit Groningen, Samenwerking Noord and the IBM Client Innovation Center.

    Source: Groninger krant

  • On-premise or cloud-based? A guide to appropriate data governance

    On-premise or cloud-based? A guide to appropriate data governance

    Data governance involves developing strategies and practices to ensure high-quality data throughout its lifecycle.

    However, besides deciding how to manage data governance, you must choose whether to apply the respective principles in an on-premise setting or the cloud.

    Here are four pointers to help:

    1. Choose on-premise when third-party misconduct is a prevalent concern

    One of the goals of data governance is to determine the best ways to keep data safe. That's why data safety comes into the picture when people choose cloud-based or on-premise solutions. If your company holds sensitive data like health information and you're worried about a third-party not abiding by your data governance policies, an on-premise solution could be right for you.

    Third-party cloud providers must abide by regulations for storing health data, but they still make mistakes. Some companies offer tools that let you determine a cloud company's level of risk and see the safeguards it has in place to prevent data breaches. You may consider using one of those to assess whether third-party misconduct is a valid concern as you strive to maintain data governance best practices.

    One thing to keep in mind is that the shortcomings of third-party companies could cause long-term damage to your company's reputation. For example, in a case where a cloud provider has a misconfigured server that allows a data breach to happen, they're to blame. But the headlines about the incident will likely primarily feature your brand and may only mention the outside company in a passing sentence.

    If you opt for on-premise data governance, your company alone is in the spotlight if something goes wrong, but it's also possible to exert more control over all facets of data governance to promote consistency. When you need scalability, cloud-based technology typically allows you to ramp up faster, but you shouldn't do that at the expense of a possible third-party blunder.

    2. Select cloud-based data governance if you lack data governance maturity

    Implementing a data governance program is a time-consuming but worthwhile process. A data governance maturity assessment model can be useful for seeing how your company's approach to data governance stacks up to industry-wide best practices. It can also identify gaps to illuminate what has to happen for ongoing progress to occur.

    Using a data governance maturity assessment model can also signal to stakeholders that data governance is a priority within your organization. However, if your assessments show the company has a long way to go before it can adhere to best practices, cloud-based data governance could be the right choice.

    That's because the leading cloud providers have their own in-house data governance strategies in place. They shouldn't replace the ones used in-house at your company, but they could help you fill in the known gaps while improving company-wide data governance.

    3. Go with on-premise if you want ownership

    One of the things that companies often don't like about using a cloud provider for data governance is that they don't have ownership of the software. Instead, they usually enter into a leasing agreement, similarly to leasing an automobile. So, if you want complete control over the software used to manage your data, on-premise is the only possibility which allows that ownership.

    One thing to keep in mind about on-premise data governance is that you are responsible for data security. As such, you must have protocols in place to keep your software updated against the latest security threats.

    Cloud providers usually update their software more frequently than you might in an on-premise scenario. That means you have to be especially proactive about dealing with known security flaws in outdated software. Indeed, on-premise data governance has the benefit of ownership, but your organization has to be ready to accept all the responsibility that option brings.

    4. Know that specialized data governance tools are advantageous in both cases

    You've already learned a few of the pros and cons of on-premise versus cloud-based solutions to meet your data governance requirements. Don't forget that no matter which of those you choose, specialty software can help you get a handle on data access, storage, usage and more. For example, software exists to help companies manage their data lakes whether they are on the premises or in the cloud.

    Those tools can sync with third-party sources of data to allow monitoring of all the data from a single interface. Moreover, they can track metadata changes, allowing users to become more aware of data categorization strategies.

    Regardless of whether you ultimately decide it's best to manage data governance through an on-premise solution or in the cloud, take the necessary time to investigate data governance tools. They could give your company insights that are particularly useful during compliance audits or as your company starts using data in new ways.

    Evaluate the tradeoffs

    As you figure out if it's better to entrust data governance to a cloud company or handle it on-site, don't forget that each option has pros and cons.

    Cloud companies offer convenience, but only if their data governance principles align with your needs. And, if customization is one of your top concerns, on-premise data governance gives you the most flexibility to make tweaks as your company evolves.

    Studying the advantages and disadvantages of these options carefully before making a decision should allow you to get maximally informed about how to accommodate your company's present and future needs.

    Author: Kayla Matthews

    Source: Information-management

  • Organizing Big Data by means of using AI

    No matter what your professional goals are, the road to success is paved with small gestures. Often framed as KPIs (key performance indicators), these transitional steps form the core categories contextualizing business data. But what data matters?

    In the age of big data, businesses are producing larger amounts of information than ever before and there needs to be efficient ways to categorize and interpret that data. That’s where AI comes in.

    Building Data Categories

    One of the longstanding challenges with KPI development is that there are countless divisions any given business can use. Some focus on website traffic while others are concerned with social media engagement, but the most important thing is to focus on real actions and not vanity measures. Even if it’s just the first step toward a sale, your KPIs should reflect value for your bottom line.

     

    Small But Powerful

    KPIs typically cover a variety of similar actions – all Facebook behaviors or all inbound traffic, for example. The alternative, though, is to break down KPI-type behaviors into something known as micro conversions. 

    Micro conversions are simple behaviors that signal movement toward an ultimate goal like completing a sale, but carefully gathering data from micro conversions and tracking them can also help identify friction points and other barriers to conversion. This is especially true any time your business undergoes a redesign or institutes a new strategy. Comparing micro data points from the different phases, then, is a high value means of assessment.
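    A minimal sketch of comparing micro conversions across two phases of a redesign might look like this; the event names and counts are hypothetical:

    # Micro-conversion rates per phase, relative to product views.
    import pandas as pd

    events = pd.DataFrame({
        "phase": ["old", "old", "old", "new", "new", "new"],
        "step":  ["view_product", "add_to_cart", "purchase",
                  "view_product", "add_to_cart", "purchase"],
        "users": [1000, 240, 60, 1000, 310, 95],
    })
    views = events[events["step"] == "view_product"].set_index("phase")["users"]
    events["rate"] = events.apply(lambda r: r["users"] / views[r["phase"]], axis=1)
    print(events)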

    AI Interpretation

    Without AI, this micro data would be burdensome to manage – there’s just so much of it – but AI tools are both able to collect data and interpret it for application, particularly within comparative frameworks. All AI needs is well-developed KPIs.

    Business KPIs direct AI data collection, allow the system to identify shortfalls, and highlight performance goals that are being met, but it’s important to remember that AI tools can’t fix broader strategic or design problems. With the rise of machine learning, some businesses have come to believe that AI can solve any problem, but what it really does is clarify the data at every level, allowing your business to jump into action.

    Micro Mapping

    Perhaps the easiest way to describe what AI does in the age of big data is with a comparison. Your business is a continent and AI is the cartographer that offers you a map of everything within your business’s boundaries. Every topographical detail and landmark is noted. But the cartographer isn’t planning a trip or analyzing the political situation of your country. That’s up to someone else. In your business, that translates to the marketing department, your UI/UX experts, or C-suite executives. They solve problems by drawing on the map.

    Unprocessed big data is overwhelming – think millions of grains of sand that don’t mean anything on their own. AI processes that data into something useful, something with strategic value. Depending on your KPI, AI can even draw a path through the data, highlighting common routes from entry to conversion, where customers get lost – what you might consider friction points, and where they engage. When you begin to see data in this way, it becomes clear that it’s a world unto itself and one that has been fundamentally incomprehensible to users. 

    Even older CRM and analytics programs fall short when it comes to seeing the big picture, and that’s why data management has changed so much in recent years. Suddenly, we have the technology to identify more than click-through rates or page likes. AI fueled by big data marks a new organizational era with an emphasis on action. If you’re willing to follow the data, AI will draw you the map.

     

    Author: Lary Alton

    Source: Information Management

  • Predictive modelling in Market Intelligence is hot


    Market intelligence is still an underexposed function in companies. How often do companies have an accurate and up-to-date picture of exactly how big their market is? And of whether it is growing or shrinking?

    B2C companies can still buy expensive reports, for considerable sums, from the information brokers of this world. And if they are lucky enough that relevant segmentations were used, that can indeed yield something. B2B companies face a much bigger challenge. Market data is usually not commercially available and has to be produced (whether or not with the help of B2C data), which actually makes market data even more expensive for these companies.

    Moreover, the discussion above only concerns data on market size and market value: the basics, you could say. Data on competitors, market shares, product developments and market-defining trends is at least as relevant for setting the right course, as well as for making tactical (purchasing, pricing, distribution) decisions.

    Yet there are possibilities! Even with scarce data, it is possible to reconstruct market data. The starting point: if we search the markets where we do have data for predictive variables, then other market data can be ‘approximated’ or ‘estimated’. This is a form of statistical reconstruction of market data that often proves more reliable than surveys or expert panels. This technique is being applied more and more in market intelligence, so data science is making its entrance into this field as well.
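    A minimal sketch of this kind of statistical reconstruction, with hypothetical predictor variables and market values, could look like this:

    # Estimate market value from predictors observed in known markets.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Markets with known size: predictors such as GDP (bn) and firm count (k)
    X_known = np.array([[450, 12], [800, 20], [300, 9], [1200, 31]])
    y_known = np.array([95, 170, 60, 255])   # known market value (m)

    model = LinearRegression().fit(X_known, y_known)

    # Approximate the market value where only the predictors are known
    X_unknown = np.array([[650, 16]])
    print(model.predict(X_unknown))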

    Once this becomes common practice, the step towards forecasting markets is of course not far away. That question is being asked more and more: can we also map out what the market will look like in 5 or perhaps even 10 years? It can be done! And the quality of those forecasts is improving, and with it their use. Market intelligence only gets more fun, and the game, of course, only more interesting.

    Source: Hammer, market intelligence

    http://www.hammer-intel.com

     

     

  • Recommending with SUCCES as a data scientist


    Have you ever walked an audience through your recommendations only to have them go nowhere? If you’re like most data scientists, chances are that you’ve been in this situation before.

    Part of the work of a data scientist is being able to translate your work into actionable recommendations and insights for stakeholders. This means making your ideas memorable, easy to understand and impactful.

    In this article, we’ll explore the principles behind the book 'Made To Stick' by Chip Heath and Dan Heath, and apply them within the context of data science. This book suggests that the best ideas follow six main principles: Simplicity, Unexpectedness, Concreteness, Credibility, Emotions, and Stories (SUCCES). After reading this article, you’ll be able to integrate these principles into your work and increase the impact of your recommendations and insights.

    Simple

    Making an idea simple is all about stripping the idea to its core. It’s not about dumbing down, but about creating something elegant. This means that you should avoid overwhelming your audience with ideas. When you try to say too many things, you don’t say anything at all. Another key component to making ideas simple is to avoid burying the lead. If during your analysis you find that 10% of customers contribute to 80% of revenues, lead with that key insight! You should follow an inverted pyramid approach where the first few minutes convey the most information, and as you get further you can get more nuanced. Analogies and metaphors are also a great way to get your ideas across simply and succinctly. Being able to use schemas that your audience can understand and relate to will make it a lot more digestible. For example, a one-sentence analogy like Uber for X can capture the core message of what you’re trying to convey.

    Unexpected

    An unexpected idea is one that violates people’s expectations and takes advantage of surprise. You can do this in several ways, one of which is making people commit to an answer, then falsifying it. For example, asking to guess how much time employees spend doing a task you’re looking to automate before revealing the real answer. Another way to generate interest and leverage the unexpected principle is to use mysteries since they lead to aha moments. This might take the form of starting your presentation with a short story that you don’t resolve until the end, for example.

    Concrete

    Abstractness is the enemy of understanding for non-experts. It’s your job as the data scientist to make your recommendations and insights more concrete. A key to understanding is using concrete images and explaining ideas in terms of human actions and senses. The natural enemy of concreteness is the curse of knowledge. As data scientists, we need to fight the urge to overwhelm our audiences with unnecessary technical information. For example, reporting on the Root Mean Squared Error of a model may not be as helpful as breaking up the language into more concrete terms that anyone can understand.

    Credible

    Adding credibility to your recommendations can take three forms. The first is the most common one when we think of credibility, which is leveraging experts to back up claims or assertions. Another way is using anti-authorities, who are real people with powerful stories. For example, if you’re talking about the dangers of smoking, the story of someone who suffers from lung cancer will be a lot more impactful than a sterile statistic. The third way of adding credibility to your story is by outsourcing the credibility of your point to your audience. This means creating a testable claim that the audience can try out, for example the claim that customers from region X take up 80% more customer support time than any other region. In posing such a claim, your audience can verify it for themselves, which makes it easier to lead them to your recommendation.

    Emotions

    Weaving an emotional component into your ideas is all about getting people to care. Humans are naturally wired to feel for humans, not for abstractions. As a result, one individual often trumps a composite statistic. Another component of emotions is tapping into the group identities that your audience conforms to. By keeping those identities in mind, you can tie in the relevant associations and evoke certain schemas that your audience will be most receptive to. For example, if you know one of your audience members is a stickler for numbers and wants to see a detailed breakdown of how you arrived at certain conclusions, adding an appendix may be helpful.

    Stories

    Humans have been telling stories for centuries and they have proven to be one of the most effective teaching methods. If you reflect on the books you’ve read in the past 5 years, you’re more likely to remember the interesting stories rather than objective facts. When weaving stories into your recommendations, make sure to build tension and don’t give everything away all at once. Another useful tactic is telling stories which act as springboards to other ideas. Creating open-ended stories that your audience can build on is a great way for them to get a sense of ownership.

    Next time you’re tasked with distilling your insights or pitching recommendations, keep in mind these six principles and you’ll be creating simple, unexpected, concrete, credentialed emotional stories in no time!

    Author: Andrei Lyskov

    Source: Towards Data Science

  • Samsung plans to open AI center in Cambridge

    Samsung Electronics Co. Ltd., the Korea-based electronics giant, will open a new artificial-intelligence center in Cambridge, England, as the company seeks to benefit from cutting-edge academic research into the technology.

    Andrew Blake, a pioneering researcher in the development of systems that enable computers to interpret visual data, and a former director of Microsoft Corp.’s Cambridge Research Lab, will head the new Samsung AI center, the company said Tuesday.

    The center may hire as many as 150 AI experts, bringing the total number of people Samsung has working on research and development in the U.K. to 400 "in the near future," the company said.

    U.K. Prime Minister Theresa May said Samsung’s new lab would create high-paying, high-skilled jobs. "It is a vote of confidence in the U.K. as a world leader in artificial-intelligence," she said. 

    Samsung said it selected Cambridge because the University of Cambridge is world-renowned for its work on machine-learning and because the city already had a number of other prominent AI research labs, including Microsoft’s.

    Blake said the new Cambridge lab would focus on areas such as getting computers to recognize human emotions and ways to improve how people communicate and interact with increasingly intelligent machines.

    Hyun-suk Kim, Samsung’s chief executive officer, said the company would be looking at uses of AI that help provide users of devices, such as the Korean manufacturer’s phones, with more personalized services that better understood human behavior.

    Samsung joins a number of technology companies ramping up research into artificial-intelligence around the globe. Facebook Inc. announced the opening of two new AI labs, in Pittsburgh and Seattle, earlier this month. DeepMind, the London-based artificial intelligence company owned by Alphabet Inc., announced the opening of a new lab in Paris in March and last year expanded in Montreal and Edmonton, Alberta, in Canada.

    But new corporate research labs have often poached top academic computer scientists, luring them with pay packages that sometimes reach into seven figures, raising fears about a brain drain that may ultimately undercut the training of the next generation of scientists. In one of the most infamous examples, Uber hired 40 researchers and engineers away from Carnegie Mellon University’s robotics lab in Pittsburgh to staff its own self-driving car effort.

    In some cases, companies have tried to allay such fears by emphasizing that top academic hires will maintain a university affiliation or continue to have some role supervising students and teaching. Samsung said Blake will continue to be affiliated with the University of Cambridge and supervise PhD students despite his appointment.

     

    Source: Bloomberg

  • SAS Academy for Data Science starts in the Netherlands in September

    For future and practicing data scientists there are few ways to obtain official credentials for their field. SAS is therefore introducing the SAS Academy for Data Science. For European participants this programme starts in September in the Netherlands. The curriculum of the SAS Academy for Data Science combines knowledge development for technologies such as big data, advanced analytics and machine learning with essential communication skills for data scientists.

    “The key to gaining competitive advantage from the enormous amounts of data is analytics and the people who can work with it,” says Pascal Lubbe, Manager Education at SAS. “The Academy for Data Science offers opportunities to professionals who are starting out in this field or who want to develop their capabilities further. Companies can also have a dedicated in-house programme developed for their employees. Students work with SAS software during the programme, but graduate broadly qualified as data scientists.”

    The tracks of the SAS Academy for Data Science consist of several elements: classroom instruction, a hands-on case or team project, certification exams and coaching. Each track takes six weeks. By passing the exams, students can earn one or two credentials: SAS Certified Big Data Professional and/or SAS Certified Data Scientist.

    A powerful mix

    The SAS Academy for Data Science distinguishes itself through its powerful mix of practical experience with analytics, computing and statistics, combined with business knowledge and presentation skills. Classes are led by experts and supported by a coach, and students get access to the SAS environment.

    The programme has two levels. In the first level, students are trained to earn the SAS Certified Big Data Professional credential; they learn how to manage and clean big data and how to visualize the data with SAS and Hadoop. Level 2 trains students to become certified SAS Data Scientists, covering predictive modeling, machine learning, segmentation and text analytics. It also addresses how SAS works together with open source programming languages. And just as importantly, students learn how to use indispensable communication skills to give data meaning for stakeholders.

    Analytics talent

    “SAS has been active in the data science field for almost 40 years, always responding to the needs of our customers. Now our customers are asking for analytics talent,” says Jim Goodnight, CEO of SAS. “Employers trust certified SAS professionals not only to manage and analyze the data, but also to understand its meaning and implications for the business. By communicating the results of an analysis clearly, better decisions can be made.”

    Source: Emerce

  • SAS Data Science & Analytics Day coming up!

    Artificial Intelligence (AI) is a topic that is on the agenda of many companies, but concrete applications are often still in their infancy. During the 'Innovate with Analytics' session, many different real-world applications of artificial intelligence were presented. It even became clear that AI can be highly relevant to the noble sport of football, although the accompanying illustration suggests that the Dutch football guru Cruijff would have had his doubts about that. Curious about more applications? Then come to the SAS Data Science & Analytics Day on 31 May and hear all about the latest developments and trends in data science and artificial intelligence.

    When we talk about AI, it is important to first have a sharp definition in mind, says Mark Bakker, data strategist at SAS. Images of human-looking robots or the Hollywood classic Terminator are often used to explain AI, while this deterministic picture does not fit the AI applications that are used today to achieve better business results. According to Bakker, AI is 'the science of training systems to emulate human tasks through learning and automation'. AI is therefore not a self-regulating robot, but an aid to human action, precisely where the machine can analyze or perform the work better than a human.

    Natural language & image modelling

    Bakker and many of his colleagues are trying to get machines to communicate in a human way. To do this, they teach the machine to understand images, sounds and text. An interesting area of attention here is the interpretation of natural language. For a machine, text is always unstructured data, while people use punctuation to ensure the correct interpretation by the party they want to communicate with. According to Bakker, a machine would struggle to interpret the sentence 'I am really happy, not', whereas a human reader will quickly understand that the writer of this message is not happy.
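
    As a toy illustration of why such a sentence is hard (this is not SAS’s method; the word lists below are made up), a simple bag-of-words sentiment score ignores the trailing negation entirely:

```python
# Toy bag-of-words sentiment scorer; the word lists are illustrative only.
POSITIVE = {"happy", "great", "good"}
NEGATIVE = {"sad", "bad", "terrible"}

def naive_sentiment(text: str) -> int:
    """Count positive minus negative words, ignoring word order and negation."""
    words = text.lower().replace(",", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(naive_sentiment("I am really happy, not"))  # 1 -> wrongly scored as positive
```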

    Bakker also shows how SAS teaches a machine to recognize a cat in a picture with 'image modelling'. The photo shows the shadow of a cat on a wall; at first the machine might think it is looking at a doormat, for example. Besides still images, it is also possible to measure emotion in video footage: through AI, the machine determines what percentage of positive emotion can be observed. Of course, Bakker acknowledges, it is very interesting to dive deeper into these developments, but it only becomes really useful when you can also apply the insights gained. A nice example of this is the 'hard hat test': by analyzing live video footage, a company can check whether an employee is wearing a hard hat, and in this way the organization ensures that workplace safety requirements are met.

    BallJames

    One of the sportiest applications of AI is BallJames. This solution from the Dutch company SciSports has the goal 'to give AI back to football clubs'. World football association FIFA prohibits the use of sensors on the pitch, the players or the ball during matches. Because clubs need accurate 3D data, fourteen BallJames cameras record all the players' actions from different spots around the football pitch. What makes BallJames special is that it is a self-learning solution: with deep learning algorithms it can generate all kinds of statistics for players, coaches and scouts alike. In the Eredivisie, Heracles Almelo is the first club where the cameras along the pitch monitor all activity.

    The Edge

    The BallJames cameras each generate about 1.4 terabytes of data per match. Because of this large amount of information, it is important to know immediately which parts you need for the analysis, so that not all of the collected data has to be transmitted and stored. Nowadays this can be done as close to the source as possible, or 'on the edge'. In the BallJames example the camera is 'on the edge': the camera collects and analyzes the data. But a sensor on the blade of a wind turbine or a thermometer in a sealed sea container can also be this starting point. Analyzing data 'on the edge' has many advantages, explains Joao Oliveira, Principal Business Solutions Manager Information Management. For instance, it enables companies to act proactively in certain situations. A laser camera in a shop can create an 'avatar' of every customer; in the case of suspicious behaviour, the shopkeeper does not want a notification but wants the doors to close automatically, so that he does not have to chase a possible thief.

    Direct deployment

    It is possible to convert the data directly into actions: 'direct deployment on the edge'. To carry out a certain activity or analysis, the data therefore does not first have to be sent to the cloud. This saves organizations time and money, says Oliveira. Suppose you discover via analytics in the cloud that the model for, say, a wind turbine sensor needs to be adjusted; that causes a temporary stop or slowdown of the processes of the entire turbine. Moreover, you want to be able to make that adjustment for a single turbine blade or one of its components, without having to shut down several turbines in the park. Another example Oliveira gives is the so-called 'smart containers' on sea-going vessels, where the temperature in the container is measured in real time. On long sea routes it is not always possible, or it is very costly, to send data to the cloud, while automatically switching on the air conditioning can reduce spoilage of the goods in the container.
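
    As a minimal sketch of this 'on the edge' idea (the thresholds and function names below are hypothetical, not the SAS or BallJames implementation), an edge device can evaluate each reading locally and only act or transmit when something deviates:

```python
# Hypothetical edge-side logic for a smart-container temperature sensor.
MAX_TEMP_C = 8.0          # assumed spoilage threshold
REPORT_EVERY_N = 360      # send a routine heartbeat only once per N readings

def switch_on_cooling() -> None:
    print("cooling on")                   # placeholder for the actuator call

def transmit(payload: dict) -> None:
    print("send to cloud:", payload)      # placeholder for the (costly) uplink

def handle_reading(temp_c: float, counter: int) -> None:
    if temp_c > MAX_TEMP_C:
        switch_on_cooling()               # act locally, no cloud round-trip needed
        transmit({"event": "overheat", "temp": temp_c})
    elif counter % REPORT_EVERY_N == 0:
        transmit({"event": "heartbeat", "temp": temp_c})
    # all other readings are discarded at the edge

for i, reading in enumerate([5.5, 6.1, 9.3, 5.9]):   # tiny synthetic feed
    handle_reading(reading, i)
```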

    Source: Analytics Today

    https://www.analyticstoday.nl/blog/ai-in-de-praktijk-van-voetbal-data-tot-slimme-sluizen/?utm_source=ATnieuwsbrief2018-week11utm_medium=email&utm_campaign=ATnieuwsbrief2018-week11

     

  • The 4 major cybersecurity threats to business intelligence


    Everywhere a business looks, there are risks, pitfalls, threats, and potential problems. We live in a world where there’s very little separation between the physical and the digital. While this may be beneficial in some ways, it’s problematic in others. When it comes to cybersecurity, businesses have to account for an array of technical and intensive challenges protecting their intelligence.

    4 Major cybersecurity threats

    For better or worse, cloud computing, the internet of things (IoT), artificial intelligence (AI), and machine learning have converged to create a connected environment that businesses must access without exposing themselves to hackers, cyber criminals, and other individuals and groups with unsavory intents. In 2019, it’s the following issues that are most pertinent and pressing:

    1. The rise of cryptojacking

    As the swift and malicious rise of ransomware has shown, criminal organizations will go to any lengths to employ malware and profit. This year, cryptojacking is a major topic.

    “Cryptojacking, otherwise known as “cryptomining malware”, uses both invasive methods of initial access and drive-by scripts on websites to steal resources from unsuspecting victims,” according to SecurityMagazine.com. “Cryptojacking is a quieter, more insidious means of profit affecting endpoints, mobile devices, and servers: it runs in the background, quietly stealing spare machine resources to make greater profits for less risk.”

    2. Lack of confidence in the marketplace

    There’s a widespread lack of confidence in cybersecurity among customers and consumers in the marketplace. This limits many of the opportunities businesses have to implement much-needed change.

    This lack of confidence stems from highly publicized data breaches and cybersecurity issues. Take the US presidential election in 2016, for example. Even though no proof of election tampering has been found, the media has led people to believe that there was some sort of breach. In the process, the notion of online voting seems unsafe, despite the fact that it’s something we need.

    In the context of business, every time there’s a major data breach, like Target or Experian, consumers lose trust in the ability of companies to protect their data. (Despite the fact that thousands of companies protect billions of pieces of data on a daily basis.)

    The challenge moving forward will be for individual businesses to practice data integrity and promote the right cyber security policies to rebuild trust and gain confidence from their customers.

    3. Supply chain attacks

    As businesses continue to build up their defenses around key aspects of their businesses, cyber criminals are looking for a softer underbelly that’s less fortified. Many of these attackers are finding it in vulnerable supply chains where risks aren’t completely understood (and where there has to be better cooperation between partners who rarely care to be on the same page).

    As we move through 2019, businesses would do well to consider what sensitive information they share with vendors. It’s equally important to consider the risk level of each vendor and which ones are worth working with.

    4. Insider threats

    According to a recent survey by Bricata on the top network security challenges facing businesses in 2019, 44% of respondents identified insider threats as an issue. (The next closest threat was IT infrastructure complexity at 42%.)

    In the context of this survey, insider threats aren’t necessarily malicious actions from employees. Instead, it’s often the result of accidental incidents and well-intended actions that go wrong. Businesses can counteract some of these insider threats by using tools like SAP Cloud Identity Access Governance, which allows businesses to use real-time visualizations to monitor and optimize employee access to data and applications.

    Better employee education and training is also a wise investment. Far too many employees remain unaware of the risks facing their employers, continuing to make foolish mistakes without realizing they’re making them.

    Moving toward a safer digital future

    While some would say we’re already living in the future, it’s important for business leaders to remain cognizant of what’s coming down the innovation pipeline so that the right strategic initiatives can be put into place. In doing so, we can all bask in the optimism of a brighter, safer digital future.

    Author: Anna Johansson

    Source: SAP

  • The 5G revolution: Perks and early uses for your business


    The much-awaited revolution is nearly here

    While the 5G PR machinery has been on a roll for a while now, there’s finally a clear picture of when exactly we can expect the technology to hit our shores: come March 2020. And India will be witness to the latest advancements in cellular technology, entailing some very impressive features and benefits.

    As the world has moved from connected devices to cars and further on to factories, the need for enhanced communication technology has amplified big time. The introduction of 5G, which will offer blazing data speeds, minimized latency and higher system capacity at reduced costs, will open up an unprecedented range of applications.

    Why’s there so much hype this time around? What’s the big difference?

    Though you may be forgiven for thinking that 5G is just another generation of communication technology, the truth is that it will usher in a world of difference in terms of performance and operations. Let’s take a quick look at the highlights of what the previous generations of technology have offered before we compare just the last two:

    • 1G – Voice calls
    • 2G – Voice calls + messaging
    • 3G – Voice calls + messaging + multimedia & internet data services
    • 4G LTE – High speed (functionalities mostly the same)

    Now, it’s from 4G to 5G that there is the biggest leap yet, both in terms of performance and the way the whole setup operates. Here’s a summarised comparison of the major metrics in both the bands:

    [Table: side-by-side comparison of the major 4G and 5G metrics]

    What’s interesting to note is that while 5G can support a much higher device density at blazing fast speeds when compared to 4G, it requires a tower every 300 meters. In other words, its range is considerably shorter than that of 4G due to its usage of high frequency signals, which cannot travel very long distances.

    Yet, the promise of 5G is immense. Just the fact that it combines three features, namely unparalleled speed, high device coverage density and low latency, makes it ideal for fundamentally new use cases in applications and business models across a variety of industries including retail, transport, government and entertainment. Let’s take a look at some of these.

    The perks

    Some of the most progressive advancements in digital transformation are yet to see the light of day for want of 5G technology to ensure feasibility as well as scalability. A few of the major applications urgently awaiting the breakthrough are listed below:

    • Self-driving, connected cars: The extremely low latency of 5G networks will be able to facilitate decisions for autonomous vehicles on a real-time basis. This in turn can help reduce and eventually eliminate road accidents, leading to safer, smoother traffic – a major necessity in today’s increasingly populated and frenzied times.
    • IoT (Internet of Things) networks: With the number of IoT-enabled devices set to explode in just about a year (86 billion sensors and devices will be deployed in the consumer segment alone by 2020, according to Forbes estimates), faster, more streamlined communication among the growing number of devices is rapidly turning from a ‘nice-to-have’ into a ‘need-to-have’. The only technology capable of supporting this proliferation of devices is 5G, which will therefore play a critical role in the continued development of smart factories and manufacturing processes.
    • Real-time robotic surgeries: The exceptional speed and low latency of 5G will, for the first time, make it possible to support remote execution of surgeries, with doctors and patients literally located on opposite sides of the world – and artificial intelligence (AI)- and machine learning (ML)-enabled robots carrying out the necessary procedures.
    • Smart watches, VR, AR and drones: A large number of relatively new inventions that require constant interactions with the environment/other devices to enable accurate, seamless operations have been unable to see full-fledged rollouts due to a lack of the speed and bandwidth afforded by 5G.

    Early uses: proactively eliminating known and latent risks through anomaly detection on 5G

    It’s inevitable that with the emergence of new inventions and technologies, there will arise new challenges. However, taking appropriate countermeasures can help businesses overcome these challenges, effectively mitigating the associated risks.

    Anomaly detection combines statistical methods with AI- and ML-based algorithms to detect anomalies in your data and alert you in real time, so that you can take preventive action to avert business-critical issues and seize profit-generating opportunities. A few use cases are explained below, followed by a minimal statistical sketch:

    • Network operations: With increased speeds, networks become more vulnerable to hackers, viruses or other malicious threats that can halt operations for anything ranging from a few minutes to a number of hours. Anomaly detection can proactively identify such threats before they happen, and immediately alert telecom providers to take appropriate action and manage the threat.
    • Customer experience: With towers placed at distances of 300 meters each, telecom operators are required to keep a close tab on a large number of assets to ensure high network connectivity at all times, as even the slightest drop in connectivity can lead to dissatisfied users. Anomaly detection constantly monitors upload and download speeds and immediately alerts operators in case of a significant decrease in either, so that they can deploy resources as quickly as possible to resolve the glitch.
    • Infrastructure management: The introduction of 5G entails considerable new infrastructure in the form of hardware, software and algorithms, for instance mmWaves (millimeter waves), multi-user massive MIMO (Multiple Input Multiple Output), beamforming, small cells and full duplex. These function like the links of a long chain, and it’s a well-known fact that the more links a chain has, the higher its vulnerability. With automated anomaly detection, any long-term or short-term breakdowns can be dealt with by instantaneously alerting users when anomalies are detected during routine monitoring of metrics.
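
    A minimal statistical sketch of this kind of detection (synthetic data and illustrative thresholds, not any vendor’s product) flags a sample when it deviates too far from a rolling baseline:

```python
# Rolling z-score anomaly detector for a stream of download-speed samples.
import random
from collections import deque

WINDOW = 60          # number of recent samples that define "normal" behaviour
THRESHOLD = 3.0      # flag samples more than 3 standard deviations from the mean

history = deque(maxlen=WINDOW)

def is_anomalous(speed_mbps: float) -> bool:
    """Compare the sample against the rolling window; keep anomalies out of the baseline."""
    if len(history) == WINDOW:
        mean = sum(history) / WINDOW
        std = (sum((x - mean) ** 2 for x in history) / WINDOW) ** 0.5 or 1e-9
        if abs(speed_mbps - mean) / std > THRESHOLD:
            return True
    history.append(speed_mbps)
    return False

# Synthetic feed: speeds around 900 Mbps, with a sudden drop at t=150.
for t in range(200):
    sample = random.gauss(900, 20) if t != 150 else 120.0
    if is_anomalous(sample):
        print(f"t={t}: anomaly, download speed {sample:.0f} Mbps")
```

    In practice an operator would tune the window length and threshold per metric and per cell site, but the principle is the same.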

    How well protected are you against future, apparently “unpredictable” incidents? Are you willing to leave business stability and continuity to chance? Find out what you can do to arrive at favorable answers to both these questions with anomaly detection on 5G.

    Source: Insidebigdata

  • The rise in importance of data and its consequences for retailers

    Storefronts and stalls have been around since forever, and even now, one just has to walk around an old city centre to see stores that are hundreds of years old. Take Fortnum and Mason in London for example, founded in 1707 and still trading in the same building. Retail has not fundamentally changed much over the millennia.

    However, since the emergence of the internet in the mid-90s and the introduction of e-commerce stores such as Amazon and eBay, the retail industry has undergone its biggest overhaul yet. The physical world of try before you buy is rapidly shrinking, while the virtual world of buy, deliver and then send it back if you don’t like it is growing.

    Amazon recently posted an impressive annual revenue of $232.9 billion. Likewise, e-commerce sales have grown by an average of 24.5% annually since 2014.

    According to the analysis of 500 high streets by accountancy specialists, 2,692 stores were shut between January and June. Just 1,569 started up – an all-time low, because of plunging confidence. Retailers are focusing more and more on online shopping.

    Gone are the days where buses, trains and cars shipped shoppers to the highstreets and out of town shopping centres. Nowadays all kinds of delivery vehicles bring the products of the shops to us.

    Useful data to retailers

    As retail businesses are closing down their storefronts on the high street, they are focusing more and more on online sales. With this change, an influx of useful data to retailers emerges that must be stored and made good use of.

    This flood of data presents retailers with a huge amount of power and opportunity to know who their customers are and to gather data about them in order to inform future business decisions, but it also presents a big challenge. Any retail business that falls behind in managing data will be relentlessly beaten by its competition.

    When customers make an online order, they must fill out all kinds of personal details, like the delivery address. With this data, companies can map where the majority of their customers are ordering from and optimize marketing campaigns accordingly based on location.

    If a business notices that a lot of its sales are in inner-city areas, for example, it may choose to double down on its marketing efforts in those areas or look for other cities in which its products may be suitable.

    Likewise, knowing where your customers are based may help to optimize delivery routes so that customers get their products as quickly and efficiently as possible. It is important to be aware that these benefits come to nothing if customer data is incorrect or there are large numbers of duplicates in retailers’ CRM systems. Data sets that are either obtained incorrectly or managed the wrong way will lead to misinformed decisions, resulting in inefficient business processes and ultimately losses for the business. It is essential that, with this uprooting of the retail industry, retailers start taking data management seriously.

    Whilst there is a large opportunity for retailers who take their data management seriously, there is also a major risk if customer data is not managed correctly. The sanctions on companies who mismanage their customer data can be substantial. Likewise, in an increasingly competitive market, missing out on the benefits of effective data management may be the angel of death for retailers around the globe. In the modern society of online shopping, data management can make or break the success of a retail business.

    Change is inevitable; it is a business’s resilience and its ability to adapt to change that determine whether it will not only survive but prosper.

    Author: Martin Doyle

    Source: DQ Global

  • The skillset of the modern day data scientist

    It’s no secret that data scientists still have one of the best jobs in the world. With the amount of data growing exponentially, the profession shows no signs of slowing down. Data scientists have a base salary of $117,000 and high job satisfaction levels, and according to the LinkedIn Workforce Report, there is a shortage of 151,717 data scientists in the U.S. alone. One explanation for this shortage is that the data scientist role is a relatively new one and there just aren’t enough trained data scientists. That’s why 365 Data Science set out to discover what makes the "typical" data scientist, aiming to dismantle the myths surrounding the role and to inspire some of you to enter the field when you otherwise may have felt like you wouldn’t fit the criteria.

    About the study

    1,001 LinkedIn profiles of people currently employed as data scientists were examined, and that data was then collated and analyzed. Forty percent of the professionals in the sample were employed at Fortune 500 companies and 60% were not; in addition, location quotas were introduced to ensure limited bias: U.S. (40%), UK (30%), India (15%) and other countries (15%). The selection was based on preliminary research on the most popular countries for data science where information is public. The first instance of this study was carried out in 2018, when it became clear that data scientists have a wide range of skills and come from an assortment of backgrounds. You can see what skills the typical data scientist used to have XXX. The tech industry and business needs are constantly changing; therefore, data scientists must change with them. That’s why we decided to replicate the study with new data for 2019. Of course, there were plenty of insights, which we will discuss in depth, but first, let's take a quick look at an overview of the typical data scientist. At first glance, we see that the data science field is dominated by men (69%) who are bilingual and prefer to program in Python or R (73%); they have worked for an average of 2.3 years as data scientists and hold a master’s degree or higher (74%). But is this what you must embody to make it as a data scientist? Absolutely not! As we segment the data, we get a much clearer view.

    Does where you went to university make a difference?

     In a profession with a six-figure salary and fantastic growth prospects, you wouldn’t be blamed for thinking that Harvard or Oxford graduates' résumés are the ones that find their way to the top of the pile on the desk of any hiring manager. But that’s not the only conclusion we can draw. It was found that a large part of our cohort attended a Top 50 university (31%). The Times Higher Education World University Ranking for 2019 helped to estimate this. But before you lose hope, notice that the second largest subset of data scientists is comprised of graduates of universities ranked above 1,001 or not ranked at all (24%). That said, it seems that attending a prestigious university does give you an advantage — hardly a surprise — but data science is far from being an exclusive Ivy League club. In fact, data science is a near-even playing field, no matter which university you graduated from. So, the data shows that a university’s rank doesn’t greatly influence your chances of becoming a data scientist. But what about your chances of getting hired by a company of specific size? Does a university’s reputation play a role there? Let’s find out!

    Are employers interested in where you went to university?

    Great news! The tier of university a data scientist attends seems to have close to no effect on his or her ability to find employment at companies of different sizes. Data scientists are valued everywhere, from Fortune 500 companies to tech start-ups. This reinforces the idea that a data scientist is judged by professional skills and level of self-preparation. That said, almost half of the cohort earned at least one online certificate (43%), with three being the average number of certificates per person. It’s worth mentioning, however, that these numbers might be higher in reality — many people do not list certificates that are no longer relevant, even if they would have been beneficial at some point. Think how unlikely it would be for an experienced senior data scientist to boast about a certificate in the fundamentals of Python. Self-preparation is a huge factor in gaining employment, but is there any correlation between the rank of the university you graduated from and whether you took online courses?

    Which graduates are more likely to take online courses?

    The assumption was that only students from lower-ranking universities would need to boost their résumés with skills from online courses. But the data tells a different story. The Top 500 ranked universities are split into five clusters, which show similar shares of graduates who have taken online courses: 39%, 38%, 40%, 39%, and 42%. These percentages are not far from the overall percentage of data scientists in the cohort who report earning a certificate (43%). The 501-1000 cluster does show 55%, which is somewhat higher and may support the notion that graduates from lower ranked universities need more courses. However, when we reach the "not ranked" cluster, the number (47%) is closer to the average. These results show that self-preparation is popular among graduates from all universities and incredibly valuable when preparing for a career in data science. Note: The 1,001+ cluster contains only seven people, which isn't a large enough sample to gain reliable insights. Therefore, these results will not be discussed.

    Conclusion

    If the results show us anything, it’s that the field of data science is fairly inclusive. As long as aspiring data scientists put in effort to develop their skills, they have a shot at success. While many top careers value a rigid (and sometimes elitist) path to success, data scientists are offered much more flexibility and freedom.

    Author: Iliya Valchanov

    Source: Oracle

  • The Top 5 Trends in Big Data for 2017

    Last year the big data market centered squarely on technology around the Hadoop ecosystem. Since then, it’s been all about ‘putting big data to work’ through use cases shown to generate ROI from increased revenue and productivity and lower risk.

    Now, big data continues its march beyond the crater. Next year we can expect to see more mainstream companies adopting big data and IoT, with traditionally conservative and skeptical organizations starting to take the plunge.

    Data blending will be more important compared to a few years ago when we were just getting started with Hadoop. The combination of social data, mobile apps, CRM records and purchase histories via advanced analytics platforms allows marketers a glimpse into the future by bringing hidden patterns and valuable insights on current and future buying behaviors to light.

    The spread of self-service data analytics, along with widespread adoption of the cloud and Hadoop, are creating industry-wide change that businesses will either take advantage of or ignore at their peril. The reality is that the tools are still emerging, and the promise of the (Hadoop) platform is not at the level it needs to be for business to rely on it.

    As we move forward, there will be five key trends shaping the world of big data:

    The Internet of Things (IoT)

    Businesses are increasingly looking to derive value from all data; large industrial companies that make, move, sell and support physical things are plugging sensors attached to their ‘things’ into the Internet. Organizations will have to adapt technologies to map with IoT data. This presents countless new challenges and opportunities in the areas of data governance, standards, health and safety, security and supply chain, to name a few.

    IoT and big data are two sides of the same coin; billions of internet-connected 'things' will generate massive amounts of data. However, that in itself won't usher in another industrial revolution, transform day-to-day digital living, or deliver a planet-saving early warning system. Data from outside the device is the way enterprises can differentiate themselves. Capturing and analyzing this type of data in context can unlock new possibilities for businesses.

    Research has indicated that predictive maintenance can generate savings of up to 12 percent over scheduled repairs, leading to a 30 percent reduction in maintenance costs and a 70 percent cut in downtime from equipment breakdowns. For a manufacturing plant or a transport company, achieving these results from data-driven decisions can add up to significant operational improvements and savings opportunities.

    Deep Learning

    Deep learning, a set of machine-learning techniques based on neural networking, is still evolving, but shows great potential for solving business problems. It enables computers to recognize items of interest in large quantities of unstructured and binary data, and to deduce relationships without needing specific models or programming instructions.

    These algorithms are largely motivated by the field of artificial intelligence, which has the general goal of emulating the human brain’s ability to observe, analyze, learn, and make decisions, especially for extremely complex problems. A key concept underlying deep learning methods is distributed representations of the data, in which a large number of possible configurations of the abstract features of the input data are feasible, allowing for a compact representation of each sample and leading to a richer generalization.

    Deep learning is primarily useful for learning from large amounts of unlabeled/unsupervised data, making it attractive for extracting meaningful representations and patterns from Big Data. For example, it could be used to recognize many different kinds of data, such as the shapes, colors and objects in a video — or even the presence of a cat within images, as a neural network built by Google famously did in 2012.
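
    As a minimal, hedged sketch of this idea (synthetic data, not the method of any vendor mentioned here), a tiny autoencoder can learn a compact representation of unlabeled samples purely by trying to reconstruct them:

```python
# Single-hidden-layer autoencoder trained on unlabeled synthetic data (NumPy only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                       # 500 unlabeled samples, 8 features
X[:, 4:] = X[:, :4] @ rng.normal(size=(4, 4))       # hidden structure: last 4 depend on first 4

n_in, n_hidden, lr = 8, 3, 0.1
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))   # encoder weights
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))   # decoder weights

for epoch in range(500):
    H = np.tanh(X @ W1)                 # encode: compact 3-dimensional representation
    X_hat = H @ W2                      # decode: attempt to reconstruct the input
    err = X_hat - X                     # reconstruction error drives the learning
    grad_W2 = (H.T @ err) / len(X)
    grad_W1 = (X.T @ ((err @ W2.T) * (1 - H ** 2))) / len(X)
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print("reconstruction MSE:", float((err ** 2).mean()))
```

    No labels are used anywhere; the network discovers the redundancy in the data on its own, which is the essence of the unsupervised representation learning described above.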

    As a result, the enterprise will likely see more attention placed on semi-supervised or unsupervised training algorithms to handle the large influx of data.

    In-Memory Analytics

    Unlike conventional business intelligence (BI) software that runs queries against data stored on server hard drives, in-memory technology queries information loaded into RAM, which can significantly accelerate analytical performance by reducing or even eliminating disk I/O bottlenecks. With big data, it is the availability of terabyte systems and massive parallel processing that makes in-memory more interesting.

    At this stage of the game, big data analytics is really about discovery. Running iterations to see correlations between data points doesn’t happen without milliseconds of latency, multiplied by millions or billions of iterations. Working in memory is roughly three orders of magnitude faster than going to disk.
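
    A rough, hedged illustration of why that matters (plain Python and a CSV file as stand-ins; real in-memory analytics platforms work very differently, and this conflates parsing cost with disk I/O): the same aggregation is timed once on data already resident in RAM and once while re-reading it from disk.

```python
# Aggregate data already held in RAM vs. re-reading and parsing it from disk each time.
import csv, random, time

rows = [(random.randint(0, 9), random.random() * 100) for _ in range(1_000_000)]

# Persist the data set once, the way a conventional system would store it.
with open("sales.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

def aggregate(pairs):
    totals = {}
    for region, amount in pairs:
        totals[region] = totals.get(region, 0.0) + float(amount)
    return totals

start = time.perf_counter()
aggregate(rows)                              # data already resident in memory
in_memory = time.perf_counter() - start

start = time.perf_counter()
with open("sales.csv", newline="") as f:
    aggregate(csv.reader(f))                 # parse and aggregate straight from disk
from_disk = time.perf_counter() - start

print(f"in memory: {in_memory:.3f}s, from disk: {from_disk:.3f}s")
```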

    In 2014, Gartner coined the term HTAP (Hybrid Transaction/Analytical Processing) to describe a new technology that allows transactions and analytic processing to reside in the same in-memory database. It allows application leaders to innovate via greater situation awareness and improved business agility; however, it entails an upheaval in the established architectures, technologies and skills, driven by the use of in-memory computing technologies as enablers.

    Many businesses are already leveraging hybrid transaction/analytical processing (HTAP); for example, retailers are able to quickly identify items that are trending as bestsellers within the past hour and immediately create customized offers for that item.

    But there’s a lot of hype around HTAP, and businesses have been overusing it. For systems where the user needs to see the same data in the same way many times during the day, and there’s no significant change in the data, in-memory is a waste of money. And while you can perform analytics faster with HTAP, all of the transactions must reside within the same database. The problem is that most analytics efforts today are about putting transactions from many different systems together.

    It’s all in the cloud

    Hybrid and public cloud services continue to rise in popularity, with investors claiming their stakes. The key to big data success is in running the (Hadoop) platform on an elastic infrastructure.

    We will see the convergence of data storage and analytics, resulting in new smarter storage systems that will be optimized for storing, managing and sorting massive petabytes of data sets. Going forward, we can expect to see the cloud-based big data ecosystem continue its momentum in the overall market at more than just the “early adopter” margin.

    Companies want a platform that allows them to scale, something that cannot be delivered through a heavy investment on a data center that is frozen in time. For example, the Human Genome Project started as a gigabyte-scale project but quickly got into terabyte and petabyte scale. Some of the leading enterprises have already begun to split workloads in a bi-modal fashion and run some data workloads in the cloud. Many expect this to accelerate strongly as these solutions move further along the adoption cycle.

     

    There is a big emphasis on APIs to unlock data and capabilities in a reusable way, with many companies looking to run their APIs in the cloud and in the data center. On-premises APIs offer a seamless way to unlock legacy systems and connect them with cloud applications, which is crucial for businesses that want to make a cloud-first strategy a reality.

    More businesses will run their APIs in the cloud, providing elasticity to better cope with spikes in demand and make efficient connections, enabling them to adopt and innovate faster than the competition.

    Apache Spark

    Apache Spark is lighting up big data. The popular Apache Spark project provides Spark Streaming to handle processing in near real time through a mostly in-memory, micro-batching approach. It has moved from being a component of the Hadoop ecosystem to the big data platform of choice for a number of enterprises.

    Now the largest big data open source project, Spark provides dramatically increased data processing speed compared to Hadoop, and as a result, is much more natural, mathematical, and convenient for programmers. It provides an efficient, general-purpose framework for parallel execution.

    Spark Streaming, a key component of Spark, is used to stream large chunks of data with help from the Spark core by breaking the data into smaller micro-batches and then transforming them, thereby accelerating the creation of RDDs. This is very useful in today’s world, where data analysis often requires the resources of a fleet of machines working together.
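
    A minimal sketch of that micro-batching model (assuming a local Spark installation and a text source on localhost:9999, e.g. fed by `nc -lk 9999`; this is illustrative, not a production configuration):

```python
# Word counts over 5-second micro-batches with Spark Streaming (DStream API).
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "MicroBatchWordCount")
ssc = StreamingContext(sc, batchDuration=5)         # each micro-batch covers 5 seconds

lines = ssc.socketTextStream("localhost", 9999)     # stream of incoming text lines
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))    # each batch becomes a small RDD
counts.pprint()                                     # print the counts for every batch

ssc.start()
ssc.awaitTermination()
```

    Newer Spark versions also offer Structured Streaming, which builds the same micro-batch idea on top of DataFrames.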

    However, it’s important to note that Spark is meant to enhance, not replace, the Hadoop stack. In order to gain even greater value from big data, companies consider using Hadoop and Spark together for better analytics and storage capabilities.

    Increasingly sophisticated big data demands mean the pressure to innovate will remain high. If they haven’t already, businesses will begin to see that customer success is a data job. Companies that are not capitalizing on data analytics will start to go out of business, with successful enterprises realizing that the key to growth is data refinement and predictive analytics.

    Author: Brad Chivukala

    Source: Information Management, 2016

  • Tips for Creating a Winning Data Scientist Team

    Finding the right mix of support to do more with your data is no easy task. Data scientists remain in high demand and fetch top dollar. Here are some tips on how to assemble a winning team.

    So much data, so little time

    Organizations continue to struggle with how to get more out of their data. “It’s not a new challenge, but the problem is only exacerbated as more data is exchanged and created at petabyte scale,” confirms Dermot O’Connor, cofounder and vice president at Boxever. “The proliferation of data and the pressure for organizations to turn data into business value has increased demand for data science professionals.” Approximately 10 percent of the workforce at Boxever consists of data scientists, and O’Connor shared his views on how best to assemble a data science team.

    Seeking the ‘total package’

    “When a company seeks to hire a data scientist, it's typically seeking someone with skills in advanced programming and statistical analysis, along with expertise in a particular industry segment,” O’Connor explains. “The need is great, and the skills gap is widening: A study by McKinsey predicts that ‘by 2018, the U.S. alone may face a 50 percent to 60 percent gap between supply and requisite demand of deep analytic talent.’ Good data scientists are often referred to as ‘unicorns’ because it is so rare to find professionals who possess all the right skills to meet today’s requirements.”

    Still the top job in America

    “As the ‘top job in America in 2016,’ data scientists don’t come cheap,” O'Connor confirms. “How can today’s organizations harness the brains behind data science to get the most out of their investment, whether in talent or technology? Here are some things to consider when building your data science team…”

    Data science is a team sport

    “There are many facets to creating successful data science teams in a practical, operational sense,” O’Connor says. “It’s rare to hire just one or two on staff, so remember that for data scientists as much as any other role, strength comes in numbers.”

    Outsource to innovate

    “If you do the math, a team of seasoned data scientists – let’s say only five – will cost you well over $1 million annually in fixed costs,” O’Connor notes. “And like many in IT functions, they’re likely to be pulled in many directions. Having a dedicated resource to optimize your systems with networks getting increasingly smarter with every interaction via machine learning is one way to ensure that projects are efficient while blending technology platform costs with the costs for data science talent that drives them.”

    Balance functional and strategic tasks

    “Part of the reason data scientists are so in demand is because they have concrete skills in predictive analytics that others – in IT and business roles – lack,” O’Connor explains. “That being said, you’ll need sufficient talent and resources to both write and maintain software and algorithms while also gathering insights from internal teams and customers to customize and optimize the logic behind them.”

    Set data scientists up for success with the right data management systems

    “High volume, omni-channel systems are very complex – and time consuming – to manage,” says O’Connor. “Having a hub where data at the individual customer level is aggregated helps set the foundation for data scientists to really shine. Finding ways to automate processes so that the right data is available on demand will make any data scientist’s life easier and will make more possible under their strategic guidance.”

    Expect to ‘see inside the black box’ of AI

    “A data scientist should be tasked with explaining the process of machine learning and artificial intelligence in layman’s terms to bring others into their realm throughout the enterprise,” O’Connor explains. “This is essential for gathering insights that make predictions stronger and actions more focused by design. And as marketers take on greater oversight of data, it’s important that CMOs and other decision-makers find complementary talent and technology to help them see the big picture to explore all that’s possible with their data.”

    Source: Information Management, 2016

  • Top 10 big data predictions for 2019

    The amount of data that is created nowadays is incredible. The amount and importance of data is ever growing, and with that the need to analyze and identify patterns and trends in data becomes critical for businesses. Therefore, the need for big data analytics is higher than ever. That raises questions about the future of big data: ‘In which direction will the big data industry evolve?’ ‘What are the dominant trends for big data in the future?’ While there are several predictions doing the rounds, these are the top 10 big data predictions that will most likely dominate the (near) future of the big data industry:

    1. An increased demand for data scientists

    It is clear that with the growth of data, the demand for people capable of managing big data is also growing. Demand for data scientists, analysts and data management experts is on the rise. The gap between the demand for and availability of people who are skilled in analyzing big data trends is big and keeps getting bigger. It is up to you to decide whether you wish to hire offshore data scientists/data managers or hire an in-house team for your business.

    2. Businesses will prefer algorithms over software

    Businesses will prefer purchasing existing algorithms over building their own, and over buying packaged software: algorithms give them more customization options, whereas software cannot be modified to fit user requirements; rather, businesses have to adjust to the software.

    3. Businesses increase investments in big data

    IDC analysts predict that the investment in big data and analytics will reach $187 billion in 2019. Even though the big data investment from one industry to the other will vary, spending as a whole will increase. It is predicted that the manufacturing industry will experience the highest investment in big data, followed by healthcare and the financial industry.

    4. Data security and privacy will be a growing concern

    Data security and privacy have been the biggest challenges in the big data and internet of things (IoT) industries. Since the volume of data started increasing exponentially, the privacy and security of data have become more complex and the need to maintain high security standards has become extremely important. If there is something that will impede the growth of big data, it is data security and privacy concerns.

    5. Machine learning will be of more importance for big data

    Machine learning will be of paramount importance regarding big data. One of the most important reasons why machine learning will be important for big data is that it can be of huge help in predictive analysis and addressing future challenges.

    6. The rise of predictive analytics

    Simply put, predictive analytics can predict the future more reliably with the help of big data analytics. It is a highly sophisticated and effective way to gather market and customer information to determine the next actions of both consumers and businesses. Analytics provide depth in the understanding of future behaviour.

    7. Chief Data Officers will have a more important role

    As big data becomes important, the role of Chief Data Officers will increase. Chief Data Officers will be able to direct functional departments with the power of deeply analysed data and in-depth studies of trends.

    8. Artificial Intelligence will become more accessible

    Without going into detail about how Artificial Intelligence becomes significantly important for every industry, it is safe to say that big data is a major enabler of AI. Processing large amounts of data to derive trends for AI and machine learning is now feasible, and with cloud-based data storage infrastructure, parallel processing of big data is possible. Big data will make AI more productive and more efficient.

    9. A surge in IoT networks

    Smart devices are dominating our lives like never before. There will be an increase in the use of IoT by businesses and that will only increase the amount of data that is being generated. In fact, the focus will be on introducing new devices that are capable of collecting and processing data as quickly as possible.

    10. Chatbots will get smarter

    Needless to say, chatbots already account for a large part of daily online interaction, and they are becoming more intelligent and capable of personalized interactions. With the rise of AI, big data will allow huge volumes of conversations to be processed and analysed, so that chatbots can follow a more streamlined, customer-focused strategy and become smarter.

    Is your business ready for the future of big data analytics? Keep the above predictions in mind when preparing your business for emerging technologies and think about how big data can play a role.

    Source: Datafloq

  • Top 4 e-mail tracking tools using big data


    Big data is being incorporated in many aspects of e-mail marketing. It has made it surprisingly easy for organizations to track the performance of e-mail marketing campaigns in fascinating ways.

    How big data changes e-mail tracking

    No matter what your role is, if you work in the technology sector, you likely spend a large portion of your day dealing with e-mail in some way. You’re sending, reading, or reviewing e-mails, or you’re checking your inbox to see if anything else comes in. By some estimates, the average worker even spends 30 hours a week checking their e-mail.

    Despite being such a centrally important and frequent job function, most of us are flying blind. We don’t understand how much time we’re spending on e-mail, nor do we have a solid understanding of whether our efforts are productive. Fortunately, there are several new e-mail tracking software tools that employers and employees can use to keep a closer eye on these metrics.

    The problem is that earlier e-mail monitoring tools lacked the analytics capabilities needed to make empirically based decisions of the quality that managers require. Big data is making it easier for companies to get deeper insights.

    Why use e-mail tracking software tools that rely on big data?

    There are many potential applications for e-mail tracking software tools, but these are some of the most important:

    • Productivity analytics. Studying how you e-mail can alert you to the nuances of your e-mail habits, including how often you send e-mail, how long it takes you to write and read e-mail, and what your busiest days and times are. You’ll learn what your worst habits are, so you can correct them and use your time more efficiently, and you’ll learn to optimize your schedule to get more done each day.
    • Sales and response metrics. Many companies rely on sales or prospecting via e-mail, but if you aren’t gathering metrics like open rates and response rates, you may not be able to improve your process over time. E-mail tracking software can help you keep tabs on your progress, and may help you gather or organize information on your prospects at the same time (a minimal sketch of how such a metric can be computed follows this list).
    • Employee monitoring. Employees waste about 3 hours a day on unproductive activities, while most human resources departments only assume that 1 hour or less is wasted per day. Using some kind of e-mail tracking can help you measure your employees’ productivity, and help you balance workloads between multiple employees.
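
    As a concrete illustration of the response metrics mentioned above, here is a minimal sketch, in Python, of how an average first-response time could be computed from raw message metadata. The field names (thread_id, sender, timestamp) are hypothetical and not taken from any particular tool; products like the ones below derive such data from the Gmail or Outlook APIs.

    # Minimal sketch: computing an average e-mail response time from raw message
    # metadata. The field names (thread_id, sender, timestamp) are hypothetical.
    from datetime import datetime

    messages = [
        {"thread_id": 1, "sender": "prospect@example.com", "timestamp": "2023-05-02 09:15"},
        {"thread_id": 1, "sender": "me@example.com", "timestamp": "2023-05-02 10:40"},
        {"thread_id": 2, "sender": "prospect@example.com", "timestamp": "2023-05-02 11:00"},
        {"thread_id": 2, "sender": "me@example.com", "timestamp": "2023-05-03 08:30"},
    ]

    def parse(ts):
        return datetime.strptime(ts, "%Y-%m-%d %H:%M")

    # Group messages per thread, ordered by time
    threads = {}
    for m in sorted(messages, key=lambda m: parse(m["timestamp"])):
        threads.setdefault(m["thread_id"], []).append(m)

    # A "response" is my first message that follows someone else's message in a thread
    response_times = []
    for msgs in threads.values():
        for prev, cur in zip(msgs, msgs[1:]):
            if prev["sender"] != "me@example.com" and cur["sender"] == "me@example.com":
                hours = (parse(cur["timestamp"]) - parse(prev["timestamp"])).total_seconds() / 3600
                response_times.append(hours)
                break  # only count the first reply per thread

    if response_times:
        print(f"Average response time: {sum(response_times) / len(response_times):.1f} hours")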

    Big data is at the root of all of these functions, and this makes it critical to control your data. It makes it easier for brands to get better insights.

    The best e-mail tracking software tools that leverage big data

    Some e-mail tracking tools focus exclusively on one e-mail function, like tracking sales or marketing campaigns. Others offer a more robust suite of features, allowing you to track your overall productivity.

    Whatever your goals are, these four tools are some of the best e-mail tracking apps you can get your hands on. They all rely on sophisticated big data analytics systems.

    1. EmailAnalytics

    First, we have EmailAnalytics, which can be thought of like Google Analytics for Gmail. This tool integrates with your Gmail or G Suite account and visualizes your e-mail activities into charts, graphs, and tables. It reports on metrics like average e-mail response time, e-mails sent, e-mails received, times and days of the week that are busiest for you, and how long your average e-mail threads tend to last. With the help of interactive data visuals and regular reports, you can quickly determine the weak points in your approach to e-mail (and resolve to fix them). The tool also enables managers to view reports for teams or employees, so you can monitor team e-mail productivity.

    2. Microsoft MyAnalytics

    Microsoft’s MyAnalytics isn’t quite as robust as EmailAnalytics, but it works quite well as a productivity tracker for Microsoft Outlook. With it, you can keep track of how you and your employees are spending the hours of your day, drawing in information from your e-mail inbox and calendar. If you’re spending too much time in meetings, or too much time on managing your inbox, you’ll be able to figure that out quickly, and start making proactive changes to your scheduling and work habits.

    3. Streak

    Streak is another Gmail tool, and one that attempts to convert Gmail into a full-fledged CRM platform. With it, you can convert messages into leads and prospects across various pipelines, and track your progress with each new prospective sale. It also offers built-in collaboration tools, so your team can work together on a single project—and track each other’s efforts.

    4. Yesware

    Yesware is designed with salespeople and sales managers in mind, and it offers prescriptive sales analytics based on your e-mail activity. With it, you can track a number of metrics within your e-mail strategy, including open rates, click-through rates, and other forms of customer engagement. Over time, you’ll learn which strategies work best for your prospects, and can use those strategies to employ more effective sales techniques.

    Implementing these e-mail tracking software tools in your business can help you better understand how you and your employees are using e-mail, improve your sales process, and spend less time on this all-too-important communication medium. Just remember, while data visuals and reports can be helpful in improving your understanding, those insights are only truly valuable if you take action on them.

    Big data makes e-mail tracking more effective than ever

    Big data is changing the nature of e-mail marketing. Companies can use more nuanced data analytics capabilities to drive their decision-making models in fascinating ways.

    Author: Matt James

    Source: SmartDataCollective

  • From Business Intelligence to Data Science

    Organizations that have years of experience with data warehouses and Business Intelligence are increasingly developing Data Science applications. That is a logical step, because data has an impact on every organization, from retailers, travel companies and financial institutions to hospitals. It is even claimed that we are currently in the middle of a fourth industrial revolution, in which data has been added as a production factor to the list of people, capital and raw materials. How do BI and Data Science relate to each other, and how does a BI organization make the step towards Data Science applications?


    Algorithms and Data
    Big Data has risen rapidly in just a few years. By now we have moved past the Big Data hype into an era that revolves around prediction, the era of Data Science, in which machine learning, artificial intelligence and deep learning play an ever larger role. We are entering a world in which singularity, the moment at which systems become more intelligent than humans, keeps coming closer. Whether we will ever reach that point nobody knows, and what will happen at that moment is even more uncertain. What is a fact, however, is that the world around us is increasingly dominated by algorithms and data.
    Hadoop, with its different way of storing data and making it searchable, has played a crucial role in the Big Data revolution. Thanks to increased computing power and the falling cost of storage, it is now possible to store and make available virtually unlimited amounts of data, so that data and technology are less and less of a barrier to innovation.

    Data and Technology
    Innovating with data naturally revolves around data and technology, but both are becoming more widely and easily available. Think, for example, of the rise of open source technology, which lets you pick the technology to fit the application. This used to be very different: it was the large organizations that could afford a license for expensive software and build a competitive advantage with it. Open source is of course not free, but its costs grow linearly with usage, not exponentially as with licensed products.

    Will Business Intelligence disappear?
    Both Business Intelligence and Data Science revolve around the smart use of data. Business Intelligence delivers reports, such as financial reports, that paint an accurate picture of what has happened. Data Science is about looking ahead, with increasing business value as the goal. Because of the experimental nature of Data Science, the outcomes will not always hit the mark.
    In practice, dashboards, visualizations and reports often contribute to awareness of the value of data. It is not unusual for a board to start developing a vision and strategy for data-driven applications on the basis of data visualizations and dashboards.

    Are existing organizational structures still adequate?
    Organizations that start working on data-driven applications would do well to take a close look at their own organization. Innovation is not about writing a Project Initiation Document (PID), but about simply getting started. Project results do not always lead to a valid business case; failure is part of innovation. Look at Google, one of the most successful organizations when it comes to data applications: many projects fail there too. The key is to experiment and to decide in short iterations whether or not to continue. Fail fast!

    Innovating like a startup
    Where Google, Microsoft and Apple developed their technology themselves in a garage, it is now startups that typically begin with state-of-the-art technology available as open source products. Students learn at university to work with open source, technology they can also use at home. Organizations that want to attract talent will have to adopt open source as well in order to remain attractive as an employer.
    The drawback of established organizations is that their way of working often does not lend itself well to innovation. At an online retailer, one department was made responsible for conversion. Full of enthusiasm, the 'Conversion' department set to work developing product recommendations. Fairly quickly it turned out that the department's success depended on the performance of other departments that pursued different targets. Buyers purchased products according to their own KPIs and marketers set prices in their own way. The engineers and front-end developers determined the user experience based on their own tests. Because of this dependence on other departments and the conflicting goals per department, the 'Conversion' department effectively had no control over its own success.

    The only way to close this gap is to start working in multidisciplinary teams that are responsible for features rather than for processes. These teams have a very different dynamic, because different disciplines work together and share the same responsibility, such as conversion. Startups have it easy in this respect: they have no existing organization, they start by attracting the right people and build up the skills along the way. Where systems used to be the most costly asset, nowadays it is people who are of the greatest value.

    The role of the Data Scientist
    Data Science plays a central role in teams that focus on innovation and the development of data-driven products. Data Science is therefore truly a business department, and certainly not a support department that works for the business. A Data Scientist also generally has a different profile from a BI specialist.
    A Data Scientist is something of a rare all-rounder. A Data Scientist generally has a background in statistics, knows machine learning, and builds applications as well as models. In addition, a Data Scientist is a good communicator and naturally curious, which makes them eager to experiment and investigate. Josh Wills, at the time responsible for Data Science at Cloudera, put it as follows: "A Data Scientist is someone who is better at statistics than a software engineer and better at software engineering than a statistician."

    From BI to Data Scientist
    Many data warehouse and Business Intelligence specialists have programming experience and could make the step to Data Science by, for example, immersing themselves in Python, R and statistics. It also helps if organizations create positions for Data Scientists, not only so that external consultancies can transfer knowledge, but also so that it becomes easier for existing employees to grow into the role. Once organizations recognize the value of Data Science, it will become clear that it is people who make the difference in the rapid development of data applications and technological innovation.

    Source: biplatform.nl

  • From data-driven to data-informed decision-making

    Many organizations are only just starting to make their decision-making data-driven; others are already further along. At first glance, the more prominent place of Big Data and algorithms in organizational decision-making seems a purely positive development. Who wouldn't want to follow the customer journey, shorten lead times and be maximally agile? Who wouldn't want smart algorithms that automate complex detective work and difficult decisions?

    Decision-making driven by Big Data and algorithms does, however, have a number of pitfalls: decisions that lean too heavily on data foster a culture in which employees are less critical, take less responsibility and rely less on their own knowledge and experience. These pitfalls apply especially when the data and algorithms are not yet sufficiently mature, which is the case in many organizations. That is why we argue for 'data-informed' decision-making, in which organizations find a balance between data and algorithms on the one hand, and intuition, grounded in knowledge and experience, on the other. In this way of working, the employee remains in control. He does not hide behind data and algorithms, but uses them to make smarter decisions.

    The upside of data-driven decision-making

    The Big Data revolution arose from the growing volume and richness of the data that is collected and stored. Moreover, smart tooling makes extracting and analyzing data ever easier. Organizations such as Google and Tesla, and the campaign teams of Hillary Clinton and Donald Trump, have broken new ground with data-driven decision-making. Google, for example, uses Big Data and complex algorithms to optimize advertisements so that they match the target audience as closely as possible. Tesla uses sensors and Big Data to detect and fix (or even predict and prevent) technical problems remotely, making recalls a thing of the past. Such applications are not reserved for hip startups, scaled-up multinationals or presidential candidates with deep pockets. Anyone can start steering on data, for example by starting with a single process or product.

    Dutch transport companies use a predictive model to determine the deployment of vehicles and staff. This helps them streamline mobility during peak hours and gives them the chance to keep improving their services. Energy companies use data for preventive maintenance and to make their processes more sustainable. Professional football clubs use data during matches to enhance the fan experience, by tracking players on the pitch or by letting fans capture and share footage via social media and smartphones.

    The pitfalls of data-driven decision-making

    When organizations make decisions purely on the basis of data and algorithms, we call that 'data driven' or 'data centric'. Many processes and even decisions are (partly) automated, the human brain fades into the background and the data takes center stage in decision-making. When algorithms and data are not yet sufficiently mature, this increases the risk of the following pitfalls:

    • Assumptions are insufficiently tested;
    • Contextual knowledge is insufficiently used;
    • The data is unreliable.

    Assumptions are insufficiently tested

    In the run-up to the economic crisis of 2008, many financial institutions steered on risk models that almost nobody understood anymore. They greatly underestimated the risk of mortgage products. They hardly questioned the models, but used them to justify that they were acting correctly. The result: a systemic miscalculation that almost nobody saw coming, with disastrous consequences.

    This example illustrates how risky it is when the assumptions behind algorithms are not, or only poorly, tested by people, and what happens when we lose confidence in our own intuition. Intuition can be a valuable addition to data, because either one on its own rarely covers the relevant reality.

    Contextual knowledge is insufficiently used

    Statistics Netherlands (CBS) reported that the Dutch borrowed more in 2011, basing this on higher credit card spending. But what was really going on? The Dutch were ordering more products online, and the credit card was often the only available means of payment. CBS counted all credit card transactions as loans, including ordinary payments. In other words: someone who paid for a book or a plane ticket online with a credit card was, according to CBS, someone who could no longer borrow from the bank and therefore used a credit card.

    This example illustrates the danger of blindly following the data without contextual knowledge. With contextual knowledge, an analyst would have analyzed and interpreted the data at a lower level of detail (the type of credit card spending).

    The data is unreliable

    In the 2016 US presidential campaign, the teams of both Hillary Clinton and Donald Trump made eager use of Big Data and algorithms, among other things for accurate polling and the efficient deployment of campaign resources. Trump won, despite a limited budget (only half of Clinton's). The story goes that the Clinton team's data was less reliable. Poll respondents did not dare admit to her team that they were going to vote for Trump; they were more honest with Trump's team, which, against all the polls, saw the victory coming five days in advance.

    Confidence in Big Data for election campaigns is now being questioned. There was, however, nothing wrong with the algorithms or the overall approach; with unreliable data, though, they are worth little or are even harmful, as has now become clear. People can simply lie or give socially desirable answers. It is not for nothing that the social sciences apply all kinds of strategies to minimize this. It is therefore important to test assumptions and data quality regularly.

    Incorrect or incomplete knowledge can have disastrous and unethical consequences

    In the American justice system, automated data analysis is used to calculate the risk of recidivism. No human is involved anymore: the data is crunched and that determines whether or not someone is released early. Scientists speak of the doomsday scenario of fully automated justice. Professor of law and informatization Corien Prins: 'Because at some point it is out of your hands, and then you no longer have any say over it.'

    The importance of intuition

    Intuition is often seen as something vague or intangible. That is mainly due to the definitions that are used: 'sensing something without thinking about it' or 'knowing by feel, without having to think about it'. What is often forgotten is that intuition is built on knowledge and experience. The more knowledge and experience, the better developed the intuition. Intuition is called 'supra-rational': it works quickly, effortlessly and unconsciously, unlike the 'normal' rational thought process, which is slow, complex and conscious. In his book Blink: The Power of Thinking Without Thinking, Malcolm Gladwell described how certain art critics can see in a fraction of a second whether a painting is genuine or fake, without immediately being able to explain why. The development of artificial intelligence is not yet at the point where it can replace these experts.

    Deciding on the basis of intuition or gut feeling has its limitations, however. We carry quite a few biases. Some truths are counter-intuitive. You think you only buy the groceries you really need, yet you regularly fall for 'three for the price of two' offers and end up throwing food away. Confirmation bias (tunnel vision) is a common one: we only see the data points that fit our view, and alternatives do not stand a chance. Moreover, as humans we are not able to analyze gigantic amounts of data in a short time without calculation errors, the way a computer can. For these human shortcomings, data and algorithms help us make better decisions.

    From data-driven to data-informed

    Organizations should not settle for data alone or intuition alone. They are two sources that reinforce each other. What is the optimal balance? That is largely determined by the state of the technology. In areas where algorithms and artificial intelligence cannot yet replace intuition, it is wise to apply 'data-informed' decision-making (see Figure). In this approach data is not leading, as it is in data-driven decision-making, but an enrichment of our own capabilities. On our own we simply cannot know and combine all the information, apply it and work flawlessly. We do have the qualities to weigh non-measurable factors, we know explanations and can give meaning to the data. And above all: we can take responsibility. Data provides us with information, but we also use intuition to make decisions. The same concept is applied in aviation: however well the autopilot works, the human pilot remains ultimately responsible. His knowledge and experience are needed to make decisions based on what the aircraft proposes. Both fully data-driven working and working purely on intuition have their limitations. Combine the best of both to make fast and sound decisions as an organization.


    Figure. Data driven and data-informed (illustration by Nick Leone, inspired by Fishman (2014), “The Dangers of Data Driven Marketing”).

    Case: Data-driven improvement at the Sociale Verzekeringsbank

    The Sociale Verzekeringsbank (SVB) wants to serve its customers as well as possible. That requires insight into the customer journey. The SVB maps the digital customer journey based on data, across customer channels, using Process Mining. This data is ultimately used to steer and improve the customer journey. The SVB formulated research questions about the expected customer journey, for example: 'How many customers who ultimately arrange a transaction offline have visited the online portal?' and 'On which web page do customers drop out?' Data analysts generated insight into the actual customer journey. The data analysis showed, for example, that more customers than expected switched from online to offline, and that they mainly did so on one specific web page in the portal. The results were interpreted by domain experts within the organization. They immediately indicated that the drop-out was most likely caused by an extra authentication step. Further analysis showed that this step appeared rather unexpectedly in the process: customers were not prepared for it, so they no longer understood it and/or were not willing to take an extra step. Based on the joint conclusions, improvement proposals were drawn up for the process, IT and web content. Their effectiveness was then tested again through data analysis.
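
    As an illustration of the kind of analysis described above, the sketch below shows how drop-out per web page could be derived from a simple click log. The event-log layout (session_id, page, completed) is an assumption made for this example and is far simpler than an actual Process Mining setup such as the SVB's.

    # Minimal sketch: find the page where most online sessions end without a
    # completed transaction. The log layout is hypothetical and assumed ordered in time.
    from collections import Counter

    event_log = [
        {"session_id": "s1", "page": "start", "completed": False},
        {"session_id": "s1", "page": "authentication", "completed": False},
        {"session_id": "s2", "page": "start", "completed": False},
        {"session_id": "s2", "page": "authentication", "completed": False},
        {"session_id": "s2", "page": "confirmation", "completed": True},
    ]

    # Reconstruct each session as an ordered list of visited pages
    sessions = {}
    for event in event_log:
        sessions.setdefault(event["session_id"], []).append(event)

    # Count, per page, how many sessions ended there without completing the transaction
    drop_offs = Counter(
        events[-1]["page"]
        for events in sessions.values()
        if not events[-1]["completed"]
    )

    for page, count in drop_offs.most_common():
        print(f"{page}: {count} abandoned sessions")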

    With data alone, the SVB would have gained little insight into the context of the customer journey and customers' motivations, and no improvement would have been realized. With intuition alone, there would have been far less insight into the actual customer journey. Customers often behave differently than expected, and not every behaviour or motivation of the customer can (yet) be captured in data.

    The basic ingredients of data-informed working

    A data-informed decision-making culture can be recognized, besides by the optimal use of data, by critical thinking, confidence in one's own judgement and a (shared) understanding of why decisions are made. Part of this is the periodic testing of decision models, for example by regularly analyzing automated decision processes after the fact, or by using feedback from customers and other stakeholders as input for your decision models. This culture of data-informed improvement requires a data infrastructure that is in order and expertise in the field of data science.

    Finally, a few concrete tips for data-informed decision-making:

    • Make sure your workforce knows how to handle data. To be competitive as an organization, employees must be critical, able to perform and interpret complex analyses, and able to define actions.
    • Keep interpreting and testing data against your intuition and vice versa, for example by working with hypotheses or research questions instead of searching for arbitrary correlations. This sharpens your understanding of what the data really means and what is actually happening in the process or with the customer.
    • Innovate and explore with new data solutions in a 'sandbox', to encourage new analyses and analysis methods. Implement them once the solution has been validated and the quality of the data and the algorithm is in order.

    Source: managementsite.nl, 23 January 2017

  • What are key trends in Big Data in 2017


    The focus on big data in 2017 will be on the value of that data, according to John Schroeder, executive chairman and founder of MapR Technologies, Inc. Schroeder offers his predictions on the 6 trends in big data we can expect.

    1. Artificial Intelligence is Back in Vogue

    “In the 1960s, Ray Solomonoff laid the foundations of a mathematical theory of artificial intelligence, introducing universal Bayesian methods for inductive inference and prediction,” Schroeder explains. “In 1980 the First National Conference of the American Association for Artificial Intelligence (AAAI) was held at Stanford and marked the application of theories in software. AI is now back in mainstream discussions and the umbrella buzzword for machine intelligence, machine learning, neural networks, and cognitive computing. Why is AI a rejuvenated trend? The three V’s come to mind: Velocity, Variety and Volume. Platforms that can process the three V’s with modern and traditional processing models that scale horizontally provide 10-20X cost efficiency over traditional platforms. Google has documented how simple algorithms executed frequently against large datasets yield better results than other approaches using smaller sets. We'll see the highest value from applying AI to high volume repetitive tasks, where consistency is more effective than human intuitive oversight, which comes at the expense of human error and cost.”

    2. Big Data for Governance or Competitive Advantage

    “In 2017, the governance vs. data value tug-of-war will be front and center,” Schroeder predicts. “Enterprises have a wealth of information about their customers and partners. Leading organizations will manage their data between regulated and non-regulated use cases. Regulated use cases require data governance, data quality and lineage, so a regulatory body can report and track data through all transformations back to the originating source. This is mandatory and necessary but limiting for non-regulatory use cases like customer 360 or offer serving, where higher cardinality, real-time data and a mix of structured and unstructured data yield more effective results.”

    3. Companies Focus on Business-Driven Applications to Avoid Data Lakes From Becoming Swamps

    “In 2017 organizations will shift from the ‘build it and they will come’ data lake approach to a business-driven data approach,” says Schroeder. “Today’s world requires analytics and operational capabilities to address customers, process claims and interface to devices in real time at an individual level. For example any ecommerce site must provide individualized recommendations and price checks in real time. Healthcare organizations must process valid claims and block fraudulent claims by combining analytics with operational systems. Media companies are now personalizing content served through set top boxes. Auto manufacturers and ride sharing companies are interoperating at scale with cars and the drivers. Delivering these use cases requires an agile platform that can provide both analytical and operational processing to increase value from additional use cases that span from back office analytics to front office operations. In 2017, organizations will push aggressively beyond an “asking questions” approach and architect to drive initial and long term business value.”

    4. Data Agility Separates Winners and Losers

    “Software development has become agile where dev ops provides continuous delivery,” Schroeder says. “In 2017, processing and analytic models will evolve to provide a similar level of agility as organizations realize that data agility, the ability to understand data in context and take business action, is the source of competitive advantage, not simply having a large data lake. The emergence of agile processing models will enable the same instance of data to support batch analytics, interactive analytics, global messaging, database and file-based models. More agile analytic models are also enabled when a single instance of data can support a broader set of tools. The end result is an agile development and application platform that supports the broadest range of processing and analytic models.”

    5. Blockchain Transforms Select Financial Service Applications

    “In 2017 there will be select, transformational use cases in financial services that emerge with broad implications for the way data is stored and transactions processed,” Schroeder explains. “Blockchain provides a global distributed ledger that changes the way data is stored and transactions are processed. The blockchain runs on computers distributed worldwide where the chains can be viewed by anyone. Transactions are stored in blocks, where each block refers to the preceding block; blocks are timestamped, storing the data in a form that cannot be altered. Hackers find it practically impossible to tamper with the blockchain since everyone has a view of the entire chain. Blockchain provides obvious efficiency for consumers. For example, customers won't have to wait for that SWIFT transaction or worry about the impact of a central datacenter leak. For enterprises, blockchain presents a cost savings and opportunity for competitive advantage.”
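
    To make the structure described in this prediction more tangible, here is a minimal Python sketch of timestamped, hash-linked blocks. It models only the linkage, not a real distributed ledger: there is no network, consensus mechanism or mining, and the transactions are invented.

    # Minimal sketch: timestamped blocks in which each block refers to the hash of
    # the preceding block, so altering any past block is immediately detectable.
    import hashlib
    import json
    import time

    def block_hash(block):
        payload = {k: block[k] for k in ("timestamp", "data", "prev_hash")}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def make_block(data, prev_hash):
        block = {"timestamp": time.time(), "data": data, "prev_hash": prev_hash}
        block["hash"] = block_hash(block)
        return block

    # Build a tiny chain of transactions
    chain = [make_block("genesis", "0" * 64)]
    for tx in ["alice pays bob 10", "bob pays carol 4"]:
        chain.append(make_block(tx, chain[-1]["hash"]))

    def is_valid(chain):
        # Every block must still hash to its stored hash and point at its predecessor
        return all(
            block_hash(b) == b["hash"] and b["prev_hash"] == chain[i - 1]["hash"]
            for i, b in enumerate(chain) if i > 0
        )

    print(is_valid(chain))                    # True
    chain[1]["data"] = "alice pays bob 1000"  # tamper with a past transaction
    print(is_valid(chain))                    # False: block 1 no longer matches its hash

    Changing an earlier block changes its hash, which breaks the reference stored in every later block; that is what makes such a ledger so hard to alter once it is distributed.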

    6. Machine Learning Maximizes Microservices Impact

    “This year we will see activity increase for the integration of machine learning and microservices,” Schroeder says. “Previously, microservices deployments have been focused on lightweight services, and those that do incorporate machine learning have typically been limited to ‘fast data’ integrations that were applied to narrow bands of streaming data. In 2017, we’ll see development shift to stateful applications that leverage big data, and the incorporation of machine learning approaches that use large amounts of historical data to better understand the context of newly arriving streaming data.”

    Source: Informatie Management, January 2017

  • Why B2B marketers shouldn't neglect B2C data


    Companies don’t buy goods and services, people do. And people buy for emotional reasons first. So, understanding what motivates people to buy is at the heart of learning why and how they consume. If you are focusing solely on B2B data, then you’re missing a critical piece of the equation.

    In the “age of the customer” where customers are in control, B2B marketers need to understand their prospects in new, sophisticated ways. This requires utilizing data about your buyers at work, but also outside of work.

    Typically, B2B data focuses on role and firmographic information, while B2C data can reveal clues to the emotional reasons and the process your customers use when making buying decisions. By combining B2C and B2B data, marketers can develop more relevant content and experiences that meet individual buyer needs. This has been proven to increase the ability to contact and engage B2B buyers.

    ‘Integrated’ customer journey

    Customers know when they are being targeted, and often they don’t like it. Let’s say you have an insect problem, and you mention it to a neighbor. Next day a pest control salesman shows up at your door. While it’s convenient that the product arrived right when you needed it, you are naturally skeptical. You feel targeted. Modern day targeting strategy must be natural and non-intrusive. And data-led insight and context is required to achieve that.

    Meeting B2B sales objectives requires thinking big picture, beyond the business, to consider what’s happening in your customer’s life. Real people shift personas and uniforms throughout their day. From 9-5, B2B buyers assume their work persona. From 5-9 they assume their home, friends, family, and general B2C persona. Despite these shifts they are all integrated. What motivates and inspires, but also what scares a customer is essentially the same across work and personal life personas.

    How and why someone buys a specific car, house, vacation or clothing brand is directly related to how a person will acquire a server, services, or consulting.

    Let’s say your customer is passionate about a certain sports car brand. This could indicate that they have a more adventurous and aggressive attitude, which often translates to the same attitude at work. These insights can help B2B marketers craft messaging and offers that connect with these attitudes and leverage them toward their product.

    Cybersecurity for example may not seem like an exciting topic, but marketing it in a clever way can show the more adventurous consumers (who also make B2B decisions) that it’s worthwhile. HP’s campaign of movie shorts parodying the TV show Mr. Robot starred Christian Slater educating people about the importance of cybersecurity. It was a bold move that brought a lot of attention.

    Combining B2B and B2C data attributes is key to understanding the emotional and philosophical nature of your customers. When this is accomplished, messaging and creative that entice buyers to act can be created.

    Data-driven marketing

    Modern customers interacting with a company through different channels (store, website, social media, app) want it to be personal. Marketers who accomplish this across platforms will increase loyalty and trust.

    Data about your customers must inform what you do. It’s not about applying B2C techniques to B2B marketing. It’s about using more data to become a better, more relevant marketer.

    Combining predictive analytics and machine-learning models with the millions of B2B and B2C data attributes we can collect about prospects nowadays provides the tools to connect 1:1 on a human level. Even better, we can use this data to expand B2B marketers' reach.

    Connecting with customers is more complicated than ever and reaching them in a modern omni-channel world can be challenging. If you’re a B2B marketer, the first step is to use data to create a 360 degree view of your customer. When you manage to do so, you can reach more buyers with more relevant content and messaging in more mediums.

    Steve Jobs was probably right with this quote: “You’ve got to start with customer experience and work back toward the technology, not the other way around.” Incorporating B2C data attributes in B2B marketing gets to the heart of understanding your customer, creating tailored customer experiences and reaching them in more relevant media. And that’s definitely a good thing to keep in mind as you strive to improve ROI.

    Author: Collin Dayley

    Source: Insidebigdata

  • Why every business should pay attention to big data

    Big Data is a big deal. We currently live in an age where information is everywhere, and data is exponentially growing at a rapid pace. This data has the potential to enable machine-assisted decision making, automation and business optimization, essentially setting the groundwork for digital transformation.

    Yet 60% of global data and analytics decision makers say their company is sitting on 100 terabytes of data or more and is struggling to keep up with the overflowing amount of information. Studies have shown that 80% of the world’s data is unstructured, and remains locked within Enterprise Information Management (EIM) systems. Unstructured data encompasses information that is continuously produced by various sources, from social media, IoT and smart devices to emails, voicemails, presentations, legal depositions, web pages, videos, and more.

    Unstructured content, on its own, or paired with structured data, can be put to work to refine a business’s strategy. But a major challenge that enterprises are struggling with today is first, understanding the complexity and volume of data that their business generates and second, knowing what to do with it.

    Unlocking the value of unstructured data

    The unstructured data being generated every day inside and outside businesses holds targeted, specific intelligence that is unique and valuable, and can be used to find the keys to current and future business drivers. By ignoring Big Data, businesses are unable to derive actionable insights that can lead to tangible outcomes. But done correctly, intelligent corporations can turn this huge stash of dusty data into richer insights and make data-driven decisions, while reducing cost and staying ahead of the competition.

    Data can be collected, fed into other applications, algorithms applied, information exchanged through deep learning and insights discovered to improve decision-making. For example, asset-intensive industries can put their data to work in the form of predictive maintenance and resource scheduling. Data can inform HR departments on hiring analysis to fill roles with the best candidates. Algorithms can be applied to help find ways to retain top talent based on measuring potential.
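
    As a rough illustration of that 'collect data, apply an algorithm, act on the insight' flow, the sketch below flags assets whose latest sensor reading drifts well above their historical baseline. The readings and the three-sigma rule are invented for this example; real predictive-maintenance systems use far richer models and data.

    # Minimal sketch of a predictive-maintenance style check: flag machines whose
    # most recent vibration reading sits far above their historical baseline.
    from statistics import mean, stdev

    history = {  # hypothetical vibration readings per machine
        "pump-1": [0.51, 0.49, 0.50, 0.52, 0.48, 0.50],
        "pump-2": [0.40, 0.41, 0.39, 0.42, 0.40, 0.71],  # last reading spikes
    }

    def needs_maintenance(readings, window=1, sigmas=3.0):
        baseline, recent = readings[:-window], readings[-window:]
        mu, sd = mean(baseline), stdev(baseline)
        # Flag the asset if any recent reading is more than `sigmas` deviations above baseline
        return any(r > mu + sigmas * sd for r in recent)

    for asset, readings in history.items():
        status = "schedule maintenance" if needs_maintenance(readings) else "OK"
        print(f"{asset}: {status}")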

    But for businesses to thrive in today’s dynamic digital landscape, they need the right tools to unlock data from every known channel, with access to insights that lies in the deepest depths of an enterprise. Advances in text analytics and machine learning are giving companies more power to cross-examine unstructured content, rather than leaving them to rely on intuition and gut instinct.

    Practical AI: the right tools will bring positive outcomes

    There’s no doubt that Artificial Intelligence (AI) is changing the world. And it’s changing the enterprise first – including some of the most paper-intensive industries such as healthcare, government and banking. Practical applications of AI, like text mining and AI-augmented capture, enable organizations to bring their data to life regardless of the source.

    Text mining is a process that allows a machine to read unstructured textual data, which usually contains more valuable context and insights than its structured counterparts. With this technology, machines are able to learn to not only identify any mentions of people, places, things or events, but can also read written text and assign emotional tone to each of these mentions. Using the financial sector as an example, sophisticated analytics allows banks and financial organizations to spot and understand trends, like common product complaints or frequently asked questions.
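
    The toy sketch below illustrates that idea of finding mentions and assigning a rough emotional tone. Real text-mining systems use trained NLP models; the product names and keyword lists here are purely illustrative.

    # Toy sketch: detect product mentions in free text and assign a rough tone.
    # Keyword lists stand in for the trained models a real system would use.
    complaints = [
        "The mobile app keeps crashing and support was unhelpful",
        "Love the new credit card rewards, signup was quick",
        "Credit card statement is confusing and fees are hidden",
    ]

    products = ["mobile app", "credit card", "mortgage"]
    negative_words = {"crashing", "unhelpful", "confusing", "hidden", "slow"}
    positive_words = {"love", "quick", "great", "helpful"}

    for text in complaints:
        lowered = text.lower()
        mentioned = [p for p in products if p in lowered]
        words = set(lowered.replace(",", " ").split())
        score = len(words & positive_words) - len(words & negative_words)
        tone = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        print(f"{mentioned or ['(no product)']} -> {tone}: {text}")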

    AI-augmented capture takes text mining a step further by capturing and interpreting content locked in documents, whether in paper form, scanned in, or in your digital files. It then uses AI to read and understand the content of documents, classifying them more effectively, so that they can be automatically routed to the right people, with the right priority level and in compliance with the sensitivity of the information in the document. For example, government entities and large enterprises struggle to grapple with the sheer volume of incoming information as part of critical operations, from onboarding, invoicing and claims to customer correspondence, and more. But these tools largely reduce the need for error-prone, time-consuming manual intervention in all of these processes.
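
    A minimal sketch of the classify-and-route step might look as follows, here using a simple scikit-learn text classifier. The categories, training snippets and routing table are invented, and real capture products add OCR, confidence thresholds and compliance rules on top of such a model.

    # Sketch: classify incoming text and route it to a team. Training data and
    # routing table are invented for illustration only.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    training_texts = [
        "invoice number 4711 amount due within 30 days",
        "please find attached our invoice for services rendered",
        "I would like to file a claim for water damage to my property",
        "claim form enclosed regarding the accident on 4 March",
        "complaint about the long waiting time at your service desk",
        "I am writing to complain about repeated billing errors",
    ]
    labels = ["invoice", "invoice", "claim", "claim", "complaint", "complaint"]

    routing = {"invoice": "accounts-payable", "claim": "claims-team", "complaint": "customer-care"}

    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(training_texts, labels)

    incoming = "attached is my claim for the storm damage of last week"
    category = model.predict([incoming])[0]
    print(f"Routed to {routing[category]} (classified as '{category}')")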

    Using the right tools can provide visibility into what customers are valuing at that moment and allows organizations to identify new product categories or business opportunities. By deploying these practical uses of AI and analytics, organizations will shift from being data-driven to insights-driven, which has become an absolute must for companies to keep up with competition in the modern day and age of the digital revolution.

    Author: Zachary Jarvinen

    Source: Dataversity

  • Why trusting your data is key in optimizing analytics


    With the emergence of self-service business intelligence tools and platforms, data analysts and business users are now empowered to unearth timely data insights on their own and make impactful decisions without having to wait for assistance from IT. It's the perfect situation for more agile, insightful business intelligence and therefore greater business advantage, right?

    The reality is that even with these new BI tools at their fingertips, most enterprises still fall short of leveraging the real power of their data. If users don't fully trust the information (even if they're able to find and comprehend it), they won't use it when making business decisions. Until organizations approach their data analytics strategy differently - by combining all aspects of how the data is managed, governed, prepared, analyzed, and shared across the enterprise - a lack of trust will prevent a business' data from being useful and leading to successful business decisions, ultimately turning it into a liability rather than an asset.

    Finding the balance between agility and trust

    Although the self-service features of modern BI platforms offer more freedom and greater analytics power to data analysts and business users, they still require enterprises to manage and maintain data quality over time. Various roadblocks impede data analysts and business users from gaining access to the trusted data they need. Businesses can overcome common and critical challenges using tactics like:

    Building agility through proper data preparation

    Many times, data prep - the process of gathering, combining, cleaning, structuring, and organizing data - is missing from the analytics equation, especially when data analysts or business users are eager to get results quickly. However, having the data clearly structured, with a common vocabulary of business terms (typically held in a business glossary or a data catalog) and data definitions, ensures that people can understand the meaning of available data, instilling trust.

    Because data is pulled from both internal systems and external sources for reporting, profiling and cleansing data is essential to secure trust in data as well as to improve the accuracy and reliability of results. Any changes made to the data should be tracked and displayed, providing users with the full history of the data should they have questions when using the data.
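
    As a small illustration of what such profiling can look like in practice, the sketch below runs a few basic completeness and validity checks with pandas. The column names and rules are assumptions made for this example, not a prescribed approach.

    # Minimal sketch: profile a dataset before sharing it, checking duplicates,
    # missing values and unparseable dates. Columns and rules are illustrative.
    import pandas as pd

    customers = pd.DataFrame({
        "customer_id": [1, 2, 2, 4],
        "email": ["a@example.com", None, "b@example.com", "c@example.com"],
        "signup_date": ["2023-01-05", "2023-02-30", "2023-03-12", "2023-04-01"],
    })

    profile = {
        "rows": len(customers),
        "duplicate_ids": int(customers["customer_id"].duplicated().sum()),
        "missing_email": int(customers["email"].isna().sum()),
        # Dates that do not parse (such as 2023-02-30) are counted as invalid
        "invalid_dates": int(pd.to_datetime(customers["signup_date"], errors="coerce").isna().sum()),
    }
    print(profile)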

    Freeing (and maximizing) the siloed data

    Data is often siloed within different business units, enterprise applications, spreadsheets, data lakes etc., making it difficult to scale and collaborate with others. The rise of self-service BI has made this problem even more relevant as more business users and teams have generated department-specific reports. People working in one silo are likely unaware of what data has already been prepared and shared in other silos, so time is wasted by reinventing data prep efforts and analytics rather than reusing and sharing them.

    Integrating data prep with self-service analytics unifies teams across the enterprise - including shrinking gaps between data analysts and the people who have more context about the data - and empowers data scientists with trusted, curated data so they can focus less on hindsight and more on foresight.

    Establishing 'a true north' through data governance

    Strong data governance practices provide an organization with structure and security for its business data. This is especially critical when data is distributed through many systems, data lakes, and data marts. Governance is the umbrella term for all the processes and rules for data, including assigned owners and data lineage - so users can clearly understand the data's past use, who has accessed it, and what changes were made (if any).

    Maintaining balance

    For an organization to fully realize the value of its data, it needs a shared, user-friendly approach where all users within a business have easy access to data they can trust to do their jobs, but in a way that is controlled and compliant, protecting data integrity. Organizations can balance the demands for convenience and collaboration with those of control by establishing and maintaining a three-tiered approach. The three tiers in this approach are:

    1: The data marketplace

    Enterprisewide data use begins with the data marketplace, where business users can easily find (or shop for) the trusted business data they need to gain analytics insights for critical decisions. The data marketplace is where all the rules of governance, shared common data prep, and shared data silos come together.

    This data marketplace concept is not a single tool, platform, or device. No single self-service data analytics tool can deliver the results organizations are looking for. The data marketplace is an overarching strategy that addresses data management and discovery with prep and governance to collect trusted data. The marketplace helps organizations address the challenges of finding, sharing, transmitting, analyzing, and curating data to streamline analytics, encourage collaboration and socialization, and deliver results. Creating a standard, collaborative approach to producing trusted, reusable, and business-ready data assets helps organizations establish a common portal of readily consumable data for efficient business analysis.

    2: Team-driven analytics

    Just as important as having quick and easy access to reliable data is the ability to share data with others in a seamless, consumer-friendly way, similar to how sophisticated online music, movie, and shopping platforms do. Through the data marketplace mentioned above, users can visually see the origin and lineage of data sets just as a consumer can see background information about the musical artist of a song just streamed on Spotify. Through this visualization, users see consistency and relevancy in models across groups and teams, and even ratings on data utilization just as we use Yelp for reviews.

    Team commentary and patterns of data use dictate which models are most useful. Similar to sharing and recommending music to a friend, business users can collaborate and share data sets with other users based on previous insights they've uncovered. This team-driven and "consumerized" approach to data discovery and analytics produces quick and reliable business results.

    3: Augmented analytics

    A newer, more advanced feature of self-service analytics starting to emerge is augmented data insights: results based on machine learning and artificial intelligence algorithms. Using the Spotify example again, when augmented analytics is applied to the marketplace, data recommendations are made based on data sets the user has accessed, just as new music is recommended to consumers based on songs they've listened to earlier.

    By automatically generating data results based on previously learned patterns and insights, augmented analytics relieves a company's dependence on data scientists. This can lead to huge cost savings for organizations because data scientists and analysts are expensive to employ and often difficult to find.
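
    The sketch below illustrates that recommendation idea in its simplest form: suggest data sets that are frequently used together with the ones a user already works with. It is a toy co-occurrence approach, standing in for the machine learning and AI algorithms mentioned above, and the access log is invented.

    # Toy sketch: recommend data sets based on what users with overlapping
    # interests have accessed. A stand-in for real ML-based recommenders.
    from collections import Counter

    access_log = {  # which data sets each analyst has used (hypothetical)
        "ana": {"sales_2023", "customer_master", "churn_scores"},
        "ben": {"sales_2023", "customer_master", "web_traffic"},
        "carl": {"customer_master", "churn_scores", "nps_survey"},
    }

    def recommend(user, log, top_n=2):
        mine = log[user]
        counts = Counter()
        for other, datasets in log.items():
            if other == user or not (mine & datasets):
                continue  # only learn from users with overlapping interests
            counts.update(datasets - mine)
        return [name for name, _ in counts.most_common(top_n)]

    print(recommend("ana", access_log))  # e.g. ['web_traffic', 'nps_survey']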

    By creating this fully integrated approach to how enterprises view and use their data, a natural shift will start to occur for the organization, moving from self-service analytics to shared business intelligence and "socialization", where all users across the organization are encouraged to contribute to and collaborate on business data for greater value and business advantage.

    A common marketplace

    Organizations that have started to make this shift are already starting to see business benefits. Similar to consumer platforms like Spotify and Amazon, in an interactive community of trust, users thrive and are inspired to share and collaborate with others. It is through this collaboration that users gain instant gratification for more insightful decision-making. Through social features and machine learning, they learn about data sets they otherwise never would have known existed. Because analysts can see business context around technical data assets and build upon others' data set recipes and/or reuse models, they can achieve better, faster decision-making and work more efficiently.

    As data complexity increases, the key to realizing the value of business data is pulling all of the different data management and analytics elements together through a common marketplace with a constant supply chain of business-ready data that is easy to find, understand, share, and most of all, trust. Only then does business data become truly intelligent.

    Author: Rami Chahine

    Source: TDWI
