10 items tagged "Data warehousing"

  • ‘Progress in BI, but keep an eye on ROI’

    Business intelligence (BI) was already named by Gartner as the top priority for the CIO in 2016. The Computable experts likewise predict that many large steps will be taken within BI. At the same time, managers must also look back and reflect on their business model when deploying big data: how do you justify the investments in big data?

    Kurt de Koning, founder of Dutch Offshore ICT Management
    Business intelligence/analytics has been put at number one by Gartner on the 2016 priority list for the CIO. In 2016, users will increasingly base their decisions on management information drawn from multiple sources, and some of those sources will consist of unstructured data. BI tools will therefore not only have to present information in a visually attractive way and offer a good user interface. When it comes to unlocking the data, the tools that stand out will be those able to create order and overview out of the many forms in which data appears.

    Laurent Koelink, senior interim BI professional at Insight BI
    Big data solutions alongside traditional BI
    Due to the growth in the number of smart devices, organisations have ever more data to process. Because insight (in the broadest sense) will be one of the most important success factors of the future for many organisations that want to respond flexibly to market demand, they will also have to be able to analyse all these new forms of information. I do not see big data as a replacement for traditional BI solutions, but rather as a complement when it comes to the analytical processing of large volumes of (mainly unstructured) data.

    In-memory solutions
    Organisations increasingly run into the performance limitations of traditional database systems when large volumes of data have to be analysed ad hoc. Specific hybrid database/hardware solutions such as those from IBM, SAP and Teradata have always offered answers here. These are now increasingly joined by in-memory solutions, partly because they are becoming more affordable and therefore more accessible, and partly because such solutions are becoming available in the cloud, which keeps their costs well under control.

    Virtual data integration
    Where data is now often physically consolidated in separate databases (data warehouses), this will, where possible, be replaced by smart metadata solutions that (with or without temporary physical, sometimes in-memory, storage) make time-consuming data extraction and integration processes unnecessary.

    Agile BI development
    Organisations are increasingly forced to move flexibly in and with the chain in which they operate. This means that the insights used to steer the business (the BI solutions) must move flexibly along with them. That requires a different way of working from BI development teams, and more and more we see methods such as Scrum being applied to BI development as well.

    BI for everyone
    Where BI has always been primarily the domain of organisations, consumers are now making ever more frequent use of BI solutions. Well-known examples are insight into personal finances and energy consumption. The analysis of income and expenses in your bank's web portal or app, and the analysis of data from smart energy meters, are telling examples. In the coming years this will only increase and become more integrated.

    Rein Mertens, head of analytical platform at SAS
    An important trend that I expect to reach maturity in 2016 is ‘streaming analytics’. Today, big data is an inseparable part of our daily practice. The amount of data generated per second keeps growing, in both the personal and the business sphere. Just look at your daily use of the internet, e-mail, tweets, blog posts and other social networks. And from the business side: customer interactions, purchases, customer service calls, promotion via SMS and social networks, et cetera.

    That amounts to growth in volume, variety and velocity of five exabytes every two days worldwide, a figure that even excludes data from sensors and other IoT devices. There is bound to be interesting information hidden in all this data, but how do you analyse it? One way is to make the data accessible and store it in a cost-effective big data platform. A technology such as Hadoop then inevitably comes into play, after which you use data visualisation and advanced analytics to extract relationships and insights from that mountain of data. In effect, you send the complex logic to the data, without having to pull all the data out of the Hadoop cluster.

    But what if you want to make smart decisions in real time on the basis of these large volumes of data? Then there is no time to store the data first and analyse it afterwards. Instead, you want to assess, aggregate, track and analyse the data directly in-stream: detecting unusual transaction patterns, analysing sentiment in text and acting on it immediately. In effect, you send the data past the logic! Logic that sits in memory and has been built to do this very fast and very smartly, and to store only the final results. Throughputs of more than a hundred thousand transactions are no exception here. Per second, that is. Stream it, score it, store it. That is streaming analytics!
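    To make the "stream it, score it, store it" idea concrete, here is a minimal Python sketch; it is not SAS's engine, and the transaction fields and threshold are invented for illustration. Each event is scored in-stream against a sliding in-memory window, and only the flagged results are kept.

    ```python
    from collections import deque
    from statistics import mean, pstdev

    def score_stream(transactions, window_size=1000, z_threshold=4.0):
        """Score each incoming transaction against a sliding window of recent
        amounts and keep only the flagged results ("stream it, score it, store it")."""
        window = deque(maxlen=window_size)   # in-memory state, no raw-data storage
        flagged = []                         # only final results are persisted
        for tx in transactions:
            if len(window) >= 30:            # wait for a minimally useful baseline
                mu, sigma = mean(window), pstdev(window) or 1.0
                z = (tx["amount"] - mu) / sigma
                if z > z_threshold:          # unusual transaction pattern
                    flagged.append({"id": tx["id"], "z": round(z, 1)})
            window.append(tx["amount"])
        return flagged

    # Example: one suspicious spike in an otherwise steady stream of payments.
    stream = [{"id": i, "amount": 100 + (i % 7)} for i in range(5000)]
    stream[2500]["amount"] = 10_000
    print(score_stream(stream))              # -> [{'id': 2500, 'z': ...}]
    ```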

    Minne Sluis, founder of Sluis Results
    From IoT (internet of things) to IoE (internet of everything)
    Everything is becoming digital and connected, even more so than we could imagine only a short while ago. The application of big data methods and techniques will therefore take an even greater flight.

    The call for adequate data governance will grow
    Although the new world revolves around letting go, giving trust and freedom, and co-creation, the call for manageability will nevertheless grow. Provided it is approached primarily from a facilitating role and ensures greater consistency and reliability, that is by no means a bad thing.

    The business impact of big data & data science is growing
    The impact of big data & data science in reinventing business processes, services and products, digitalising them extensively (and making them more intelligent), or in some cases eliminating them, will continue.

    Consumerisation of analytics continues
    Strongly improved and truly intuitive visualisations, underpinned by good meta-models and thus by data governance, drive this development. Democratisation and independence from third parties (other than services deliberately taken from the cloud) are thereby increasingly becoming reality.

    Big data & data science will fully break through in the non-profit sector
    The subtle objectives of the non-profit sector, such as improving quality, (patient/client/citizen) safety, punctuality and accessibility, call for big data applications. After all, that subtlety requires more good information, and therefore data, delivered faster and with more detail and nuance than what currently comes out of the more traditional BI environments. If the non-profit sector manages to translate the profit sector's much-needed focus on ‘profit’ and ‘revenue growth’ to its own situation, successful big data initiatives are just around the corner! Mind you, this prediction naturally applies in full to healthcare as well.

    Hans Geurtsen, business intelligence architect data solutions at Info Support
    From big data to polyglot persistence
    In 2016 we will no longer talk about big data, but simply about data: data of all kinds and in all volumes, which calls for different kinds of storage: polyglot persistence. Programmers have known the term polyglot for a long time. An application anno 2015 is often already written in multiple languages. But on the storage side of an application, relational is no longer the only game in town either. We will increasingly use other kinds of databases in our data solutions, such as graph databases, document databases, et cetera. Alongside specialists who know everything about one kind of database, you then also need generalists who know exactly which database is suited to what.
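    As a rough illustration of polyglot persistence, the sketch below uses an in-memory SQLite database and plain Python structures as stand-ins for a document and a graph database; the order data and names are invented. The same order is stored in three shapes, each suited to a different kind of question.

    ```python
    import json, sqlite3

    # Relational store: well suited to consistent, tabular facts (orders, totals).
    rel = sqlite3.connect(":memory:")
    rel.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
    rel.execute("INSERT INTO orders VALUES (1, 'ACME', 99.50)")

    # Document store: the same order with a flexible, nested payload
    # (stand-in for a document database such as MongoDB).
    doc_store = {}
    doc_store["order:1"] = json.dumps({
        "customer": "ACME",
        "lines": [{"sku": "A-17", "qty": 2}, {"sku": "B-03", "qty": 1}],
    })

    # Graph store: relationships between customers and products
    # (stand-in for a graph database such as Neo4j).
    graph = {"ACME": {"BOUGHT": ["A-17", "B-03"]}}

    # Each question goes to the store that answers it most naturally.
    print(rel.execute("SELECT SUM(total) FROM orders").fetchone())
    print(json.loads(doc_store["order:1"])["lines"])
    print(graph["ACME"]["BOUGHT"])
    ```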

    The breakthrough of the modern data warehouse
    ‘A polyglot is someone with a high degree of proficiency in several languages’, according to Wikipedia. That refers to spoken languages, but you come across the term more and more in IT as well: an application coded in multiple programming languages that stores data in multiple kinds of databases. On the business intelligence side, too, a single language and a single environment no longer suffice. The days of the traditional data warehouse with an ETL pipeline, one central data warehouse and one or two BI tools are numbered. We will see new kinds of data platforms in which all kinds of data from all kinds of sources become accessible to information workers and data scientists using all kinds of tools.

    Business intelligence in the cloud
    Where Dutch companies in particular are still reluctant when it comes to the cloud, the move to the cloud is slowly but surely getting underway. More and more companies realise that security, in particular, is often better arranged in the cloud than they could arrange it themselves. Cloud providers are also doing more and more to bring European companies to their cloud; Microsoft's new data centres in Germany, where not Microsoft but Deutsche Telekom controls access to customer data, are one example. 2016 may well be the year in which the cloud really breaks through and in which we will also see more and more complete BI solutions in the cloud in the Netherlands.

    Huub Hillege, principal data(base) management consultant at Info-Shunt
    Big data
    The big data hype will certainly continue into 2016, but success at individual companies is by no means guaranteed in advance. Companies and recent graduates keep winding each other up about its application. It is hard to understand why everyone wants to unlock data from Facebook, Twitter and the like while the data in those systems is highly unreliable. At every conference I ask where the business case is, including costs and benefits, that justifies all the investments around big data. Even BI managers at companies encourage people to simply get started. In effect that means looking backwards at the data you happen to have or can obtain, and checking whether you can find something that might be of use. To me this is the biggest pitfall, just as it was at the start of data warehouses in 1992. In the current circumstances companies have limited money; frugality is called for.

    The analysis of big data must be aimed at the future, based on a clear business strategy and a cost/benefit analysis: which data do I need to support that future? Determine in particular:

    • Where do we want to go?
    • Which customer segments do we want to add?
    • Are we going to do more cross-selling (more products) to our current customers?
    • Are we going to take steps to retain our customers (churn)?

    Once these questions have been prioritised and recorded, an analysis must follow:

    • Which data and sources do we need for this?
    • Do we have the data ourselves, are there 'gaps', or do we need to buy external data?

    Database management systems
    More and more database management system (DBMS) vendors are adding support for big data solutions, for example the Oracle/Sun Big Data Appliance and Teradata/Teradata Aster with support for Hadoop. In the long run the DBMS solutions will dominate the field; big data software solutions without a DBMS will ultimately lose out.

    Fewer and fewer people, including today's DBAs, still understand how a database/DBMS works deep down at the technical level. More and more often, physical databases are generated from logical data modelling tools, and the formal physical database design steps and reports are skipped. Developers using ETL tools such as Informatica, Ab Initio, InfoSphere, Pentaho et cetera also end up generating SQL scripts that move data from sources to operational data stores and/or the data warehouse.

    BI tools such as MicroStrategy, Business Objects, Tableau et cetera likewise generate SQL statements.
    Such tools are usually developed initially for one particular DBMS, and people soon assume they can therefore be applied to every DBMS. Too little use is then made of the specific physical characteristics of each DBMS.

    The absence of real in-depth knowledge then causes performance problems that are only discovered at a late stage. In recent years, by changing database designs and indexes and by restructuring complex or generated SQL scripts, I have brought ETL processes down from six to eight hours to one minute, and queries that ran for 45 to 48 hours down to 35 to 40 minutes.
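    As a small illustration of the kind of physical design change described here (not the actual systems involved; the table and column names are invented), the SQLite sketch below shows how adding an index changes the query plan from a full table scan to an index search:

    ```python
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?)",
                    [(i % 50_000, i * 0.1) for i in range(200_000)])

    query = "SELECT SUM(amount) FROM sales WHERE customer_id = ?"

    # Without an index, the DBMS has to scan the whole table for every lookup.
    print(con.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall())

    # A physical design change: an index tailored to the actual access path.
    con.execute("CREATE INDEX ix_sales_customer ON sales (customer_id)")
    print(con.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall())
    # The plan switches from 'SCAN sales' to 'SEARCH sales USING INDEX ...'.
    ```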

    Advice
    The volume of data needed will only keep growing. Forget buying all kinds of hyped software packages. Make sure you bring in very strong technical database/DBMS expertise to build the foundation properly from the bottom up, using the strengths of the DBMS you already have. That frees up time and money (you can manage with smaller systems because the foundation is solid) to select the right tools, after a sound business case and proofs of concept.

  • Be careful when implementing data warehouse automation

    Automation can be a huge help, but automating concepts before you understand them is a recipe for disaster.

    The concept of devops has taken root in the world of business intelligence and analytics.

    The overall concept of devops has been around for a while in traditional IT departments as they sought to expand and refine the way that they implemented software and applications. The core of devops in the world of analytics is called DWA (data warehouse automation), which links together the design and implementation of analytical environments into repeatable processes and should lead to increased data warehouse and data mart quality, as well as decreased time to implement those environments.

    Unfortunately, for several reasons the concept of data warehouse automation is not a silver bullet when it comes to the implementation of analytical environments.

    One reason is that you really shouldn't automate concepts before you fully understand them. As the saying goes, don't put your problems on roller skates. Automating a broken process only means that you make mistakes faster. Now, while I often advocate the concept of failing faster to find the best solution to an analytical problem, I don't really agree with the concept of provisioning flawed database structures very quickly only to rebuild them later.

    Another issue with applying devops to analytical practices is that the software development community has a 10-15 year head start on the analytical community when it comes to productizing elements of their craft.

    Software developers have spent years learning how to best encapsulate their designs into object-oriented design, package that knowledge, and put it in libraries for use by other parts of the organization, or even by other organizations. Unfortunately, the design, architecture, and implementation of analytical components, such as data models, dashboard design, and database administration, are viewed as an art and still experience cultural resistance to the concept that a process can repeat the artistry of a data model or a dashboard design.

    Finally, there is the myth that data warehouse automation or any devops practice can replace the true thought processes that go into the design of an analytical environment.

    With the right processes and cultural buy-in, DWA will provide an organization with the ability to leverage their technical teams and improve the implementation time of changes in analytical environments. However, without that level of discipline to standardize the right components and embrace artistry on the tricky bits, organizations will take the concept of data warehouse automation and fail miserably in their efforts to automate.

    The following is good advice for any DWA practice:

    • Use the right design process and engage the analytical implementation teams. Without this level of forethought and cultural buy-in, the process becomes more of an issue than it does a benefit and actually takes longer to implement than a traditional approach.
    • Find the right technologies to use. There are DWA platforms available to use, but there are also toolsets such as scripting and development environments that can provide much of the implementation value of a data warehouse automation solution (a minimal sketch of that scripting approach follows this list). The right environment for your team’s skills and budget will go a long way to either validating a DWA practice or showing its limitations.
    • Iterate and improve. Just as DWA is designed to iterate the development of analytical environments, data warehouse automation practices should have the same level of iteration. Start small. Perfect the implementation. Expand the scope. Repeat.
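    As referenced above, much of what a DWA platform automates can be approximated with plain scripting. The following is a minimal, hypothetical sketch (the table metadata and names are invented, and it does not represent any particular DWA product) of generating repeatable DDL from a small design repository:

    ```python
    # Hypothetical table definitions as they might live in a design repository.
    MODEL = {
        "dim_customer": {"customer_key": "INTEGER", "name": "VARCHAR(200)", "segment": "VARCHAR(50)"},
        "fact_sales":   {"sale_key": "INTEGER", "customer_key": "INTEGER", "amount": "DECIMAL(12,2)"},
    }

    def generate_ddl(model: dict) -> str:
        """Turn the metadata into repeatable, reviewable CREATE TABLE statements."""
        statements = []
        for table, columns in model.items():
            cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns.items())
            statements.append(f"CREATE TABLE {table} (\n  {cols}\n);")
        return "\n\n".join(statements)

    print(generate_ddl(MODEL))
    ```

    The point is not the code itself but that the design lives in one reviewable place and the implementation is regenerated from it, so a change to the model is a change to every environment built from it.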

    Source: Infoworld

  • Business Intelligence in 3PL: Mining the Value of Data

    In today’s business world, “information” is a renewable resource and virtually a product in itself. Business intelligence technology enables businesses to capture historical, current and predictive views of their operations, incorporating such functions as reporting, real-time analytics, data and process mining, performance management, predictive analytics, and more. Thus, information in its various forms and locations possesses genuine inherent value.
     
    In the real world of warehousing, the availability of detailed, up-to-the-minute information on virtually every item in the operators’ custody, from inbound dock to delivery site, leads to greater efficiency in every area it touches. Logic suggests that greater profitability ensues.
     
    Three areas of 3PL operations appear to benefit most from the savings opportunities identified through business intelligence solutions: labor, inventory, and analytics.
    In the first case, business intelligence tools can help determine the best use of the workforce, monitoring its activity in order to assure maximum effective deployment. The result: potentially major jumps in efficiency, dramatic reductions in downtime, and healthy increases in productivity and billable labor.
     
    In terms of inventory management, the metrics obtainable through business intelligence can stem inventory inaccuracies that would have resulted in thousands of dollars in annual losses, while also reducing write-offs.
     
    Analytics through business intelligence tools can also accelerate the availability of information, as well as provide the optimal means of presentation relative to the type of user. One such example is the tracking of real-time status of work load by room or warehouse areas; supervisors can leverage real-time data to re-assign resources to where they are needed in order to balance workloads and meet shipping times. A well-conceived business intelligence tool can locate and report on a single item within seconds and a couple of clicks.
     
    Extending the Value
    The value of business intelligence tools is definitely not confined to the product storage areas.
     
    With automatically analyzed information available in a dashboard presentation, users – whether in the office or on the warehouse floor – can view the results of their queries/searches in a variety of selectable formats, choosing the presentation based on its usefulness for a given purpose. Examples:
    • Status checks can help identify operational choke points, such as if/when/where an order has been held up too long; if carrier wait-times are too long; and/or if certain employees have been inactive for too long.
    • Order fulfillment dashboards can monitor orders as they progress through the picking, staging and loading processes, while also identifying problem areas in case of stalled processes.
    • Supervisors walking the floor with handheld devices can both encourage team performance and, at the same time, help assure efficient dock-side activity. Office and operations management are able to monitor key metrics in real-time, as well as track budget projections against actual performance data.
    • Customer service personnel can call up business intelligence information to assure that service levels are being maintained or, if not, institute measures to restore them.
    • And beyond the warehouse walls, sales representatives in the field can access mined and interpreted data via mobile devices in order to provide their customers with detailed information on such matters as order fill rates, on-time shipments, sales and order volumes, inventory turnover, and more.
    Thus, well-designed business intelligence tools not only can assemble and process both structured and unstructured information from sources across the logistics enterprise, but can deliver it “intelligently” – that is, optimized for the person(s) consuming it. These might include frontline operators (warehouse and clerical personnel), front line management (supervisors and managers), and executives.
     
    The Power of Necessity
    Chris Brennan, Director of Innovation at Halls Warehouse Corp., South Plainfield N.J., deals with all of these issues as he helps manage the information environment for the company’s eight facilities. Moreover, as president of the HighJump 3PL User Group, he strives to foster collective industry efforts to cope with the trends and issues of the information age as it applies to warehousing and distribution.
     
    “Even as little as 25 years ago, business intelligence was a completely different art,” Brennan has noted. “The tools of the trade were essentially networks of relationships through which members kept each other apprised of trends and happenings. Still today, the power of mutual benefit drives information flow, but now the enormous volume of data available to provide intelligence and drive decision making forces the question: Where do I begin?”
     
    Brennan has taken a leading role in answering his own question, drawing on the experience and insights of peers as well as the support of HighJump’s Enterprise 3PL division to bring Big Data down to size:
     
    “Business intelligence isn’t just about gathering the data,” he noted, “it’s about getting a group of people with varying levels of background and comfort to understand the data and act upon it. Some managers can glance at a dashboard and glean everything they need to know, but others may recoil at a large amount of data. An ideal BI solution has to relay information to a diverse group of people and present challenges for them to think through.”
     
    source: logisticviewpoints.com, December 6, 2016
  • Data warehouse automation: what you need to know

    In the dark about data warehousing? You’re not alone

    You would be forgiven for not knowing data warehousing exists, let alone that it’s been automated. It’s not a topic that gets a lot of coverage in the UK, unlike in the USA and Europe. It might be that Business Intelligence and Big Data Analytics are topics that have more ‘curb’ appeal. But, without data warehousing, data analytics would not generate the quality of business intelligence that organisations rely on. So what is a data warehouse and why did it need to be automated?

    Here’s what you need to know about data warehouse automation.

    In its most basic form a data warehouse is a repository where all your data is put so that it can be analysed for business insight, and most businesses have one. Your customers will most likely have one because they need the kind of insight data analysis provides. Business Insight or Intelligence (BI) helps the business make accurate decisions, stay competitive and ultimately remain profitable.

    In retail, for example, the accurate and timely reporting of sales, inventory, discounts and profit is critical to getting a consolidated view of the business at all levels and at all locations. In addition, analysing customer data can inform businesses which promotions work, which products sell, which locations work best, what loyalty vouchers and schemes are working, and which are not. Knowing customer demographics can help retailers to cross or upsell items. By analysing customer data companies can tailor products to the right specification, at the right time thereby improving customer relations and ultimately increasing customer retention.

    Analysing all the data

    But this is only part of the picture. The best intelligence will come from an analysis of all the data the company has. There are several places where companies get data. They usually have their own internal systems that hold finance data, HR data, sales data, and other data specific to their business. In addition, most of your customers will now also collect data from the internet and social media (Big Data), with new data coming in from sensors, GPS and smart devices (IoT data). The data warehouse can pull any kind of data from any source into one single place for analysis. A lack of cross-pollination across the business can lead to missed opportunities and a limited corporate view.

    Previously, getting the data from its source (internal or external) into the data warehouse involved writing code by hand. This was monotonous, slow and laborious. It meant that the data warehouse took months to build, and was then rigidly stuck with the coding (and therefore the design) it had been built with. Any changes that needed to be made were equally slow and time-consuming, creating frustration for both IT and the business. For the business, the data often took so long to produce that it was out of date by the time they had it.

    Automation

    Things have moved on since the days of the traditional data warehouse, and the design and build of a data warehouse is now automated, optimised and wizard-driven. The coding is generated automatically. With automation, data is available at the push of a button. Your customers don’t have to be IT experts to create reports, and employees don’t need to ask head office if they want information on a particular product line. Even more importantly, when you automate the data warehouse lifecycle you make it agile, so as your business grows and changes, the warehouse can adapt. As we all know, it’s a false economy to invest in a short-term solution which, in a few years, will not be fit for purpose. Equally, it’s no good paying for excellent business intelligence tools and fancy reporting dashboards if the data underneath is not fully accessible, accurate and flexible.

    What does this mean for the channel?

    So now you know the importance of a data warehouse for data analytics, and how automation has brought data warehousing into the 21st century. So, what next? What does this mean for the channel?

    Not everyone in the channel will be interested in automation. Faster, more efficient projects might not look like they will generate the immediate profit margins or revenue of a longer, slower one. But innovative channel partners will be able to see that there are two clear advantages for them. One is that the projects, whilst shorter, never really end, which means there is a consistent stream of income. Secondly, by knowing about and offering your clients data warehouse automation, the channel partner shows their expertise and consultancy abilities.

    The simple fact is that most companies have a data warehouse of some kind, from giant supermarkets such as Tesco and Sainsbury’s to smaller businesses like David Lloyd or Jersey Electricity. You don’t want to be the channel partner who didn’t know about or didn’t recommend the best, most efficient solution for your client. This could impact more than just the immediate sales. By educating your customers about the benefits of data warehouse automation you will bring a wealth of efficiencies to their company, and most likely a wealth of future recommendations to yours.

    Source: ChannelPro

  • Decision making by smart technology

    That is the name of the conference that Heliview is organising on Tuesday 27 January 2015 in ’s-Hertogenbosch on Business Intelligence & Data Warehousing. According to many sources, business intelligence remains on the priority list of Dutch organisations. The amount of structured and unstructured data is growing at record pace, and this data is of invaluable worth to organisations. Business intelligence enables organisations to process data in a smart way into the right information, thereby saving time and money and staying ahead of the competition. Smart organisations are, more and more often, also successful organisations.

    The Heliview conference (chaired by BI-kring initiator Egbert Philips) centres on the classic BI triangle. Speakers such as Rick van der Lans, Arent van ‘t Spijker and many others will discuss how organisations make better decisions by deploying current technological possibilities in the field of data and information processing in a smart, tailored way. On the technology side, 27 January will focus on social BI, mobile BI, business analytics and data warehousing in the cloud.

    Read more about the conference here

     

  • Three components of Agile BI at Alliander


    The utility company Alliander manages energy networks that distribute gas and electricity across a large part of the Netherlands and serves around 3.3 million customers. Alliander wants to respond to the unpredictability of both the energy market and technological developments and to be a ‘data-driven’ network operator.

    Having state-of-the-art BI & analytics solutions at your disposal and still having to deliver data dumps for Excel reports? It is not something any organisation wants, but it is often the reality, and let's be honest: in your organisation too. People find that BI projects take too long, so ‘Excelerados’ are cobbled together instead. At Alliander we face this as well, and this manual alternative is obviously undesirable. The presence within Alliander of expensive BI solutions in an entrenched, inflexible architecture with long development times is not desirable either. That is why we have applied three components in developing BI projects to gain more agility. These components are Scrum, a data provisioning layer equivalent to the logical data warehouse, and data profiling.

    Within Alliander we recognise at least four problem areas in the way of working with an outdated architecture: ‘Excelerados’, expensive BI solutions, inflexibility and long development times. An agile product design has therefore proved essential for Alliander to realise our ambition and take on the challenges identified. The agile product came about through three techniques: Alliander's Data Provisioning Layer (DPL) as a logical data warehouse; responding flexibly to changing information needs with Agile Data Modeling (the so-called account-based model); and direct feedback from the end user with the help of data profiling and Scrum. We want to explain this in a series of three blogs. This is the first.

    Data Provisioning Layer as a logical data warehouse
    The heart of agile product development is the architecture you work with. At Alliander, the agility of the architecture takes the shape of a logical data warehouse. The Data Provisioning Layer (DPL) is the part of the architecture that is deployed as that logical data warehouse.

    The Data Provisioning Layer makes data available from the various traditional sources (including, for example, existing data warehouses) that we know within Alliander, but also data from outside Alliander, for instance data from the Central Connection Register (CAR) or from the Chamber of Commerce (KvK). In addition, the DPL also makes real-time data available, for example from the electricity or gas grid, so that it can be combined with other data, such as measurement data from a substation (telemetry).

    By working with views in the DPL, with virtual information models underneath, it makes no difference to data users where the data comes from. This enables us, for example, to combine transaction data from our ERP system very quickly with geographical data from our GIS systems, or with real-time data from the grids.
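    A minimal sketch of that idea, using SQLite and invented table names rather than Alliander's actual DPL, shows how a view can hide which underlying source the data comes from:

    ```python
    import sqlite3

    con = sqlite3.connect(":memory:")
    # Stand-ins for two underlying sources: ERP transactions and GIS asset locations.
    con.execute("CREATE TABLE erp_transactions (asset_id TEXT, cost REAL)")
    con.execute("CREATE TABLE gis_assets (asset_id TEXT, municipality TEXT)")
    con.executemany("INSERT INTO erp_transactions VALUES (?, ?)",
                    [("TR-001", 1200.0), ("TR-002", 800.0)])
    con.executemany("INSERT INTO gis_assets VALUES (?, ?)",
                    [("TR-001", "Arnhem"), ("TR-002", "Haarlem")])

    # The 'provisioning layer': a view that hides where the data physically lives.
    con.execute("""
        CREATE VIEW dpl_asset_costs AS
        SELECT g.municipality, SUM(e.cost) AS total_cost
        FROM erp_transactions e JOIN gis_assets g USING (asset_id)
        GROUP BY g.municipality
    """)

    # A report or dashboard only ever queries the view.
    print(con.execute("SELECT * FROM dpl_asset_costs").fetchall())
    ```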

    A dashboard or other application is based on a view from the DPL, where the data may, for example, come directly from an operational data store or from the source system. If, for performance reasons, a dimensional data model turns out to be needed, the existing information model remains intact and so does the application that was built on it for the user.

    Conclusions
    By deploying the virtualised DPL according to the concept of the logical data warehouse, we have been able to achieve the following benefits:

    • short delivery times for reports, by decoupling data sources from data consumers;
    • the ability to combine external and internal sources;
    • access to up-to-date data at all times;
    • access to big data sources.

    [Figure: Alliander architecture diagram showing the virtualised DPL layer]

    The virtualised layer, labelled DPL in the diagram above, makes faster integration of the various sources possible, which leads to better results.

    In the second part of this blog we will look at the next technique applied: Agile Data Modeling.


    Hüseyin Kara is Senior BI Consultant at Alliander.
    Sam Geurts is Scrum Master Data & Inzicht at Alliander.

  • Five factors to help select the right data warehouse product

    How big is your company, and what resources does it have? What are your performance needs? Answering these questions and others can help you select the right data warehouse platform.

    Once you've decided to implement a new data warehouse, or expand an existing one, you'll want to ensure that you choose the technology that's right for your organization. This can be challenging, as there are many data warehouse platforms and vendors to consider.

    Long-time data warehouse users generally have a relational database management system (RDBMS) such as IBM DB2, Oracle or SQL Server. It makes sense for these companies to expand their data warehouses by continuing to use their existing platforms. Each of these platforms offers updated features and add-on functionality (see the sidebar, "What if you already have a data warehouse?").

    But the decision is more complicated for first-time users, as all data warehousing platform options are available to them. They can opt to use a traditional DBMS, an analytic DBMS, a data warehouse appliance or a cloud data warehouse. The following factors may help make the decision process easier.

    1. How large is your company?

    Larger companies looking to deploy data warehouse systems generally have more resources, including financial and staffing, which translates to more technology options. It can make sense for these companies to implement multiple data warehouse platforms, such as an RDBMS coupled with an analytical DBMS such as Hewlett Packard Enterprise (HPE) Vertica or SAP IQ. Traditional queries can be processed by the RDBMS, while online analytical processing (OLAP) and nontraditional queries can be processed by the analytical DBMS. Nontraditional queries aren't usually found in transactional applications typified by quick lookups. This could be a document-based query or a free-form search, such as those done on Web search sites like Google and Bing.

    For example, HPE Vertica offers Machine Data Log Text Search, which helps users collect and index large log file data sets. The product's enhanced SQL analytics functions deliver in-depth capabilities for OLAP, geospatial and sentiment analysis. An organization might also consider SAP IQ for in-depth OLAP as a near-real-time service to SAP HANA data.

    Teradata Corp.'s Active Enterprise Data Warehouse (EDW) platform is another viable option for large enterprises. Active EDW is a database appliance designed to support data warehousing that's built on a massively parallel processing architecture. The platform combines relational and columnar capabilities, along with limited NoSQL capabilities. Teradata Active EDW can be deployed on-premises or in the cloud, either directly from Teradata or through Amazon Web Services.

    For midsize organizations, where a mixture of flexibility and simplicity is important, reducing the number of vendors is a good idea. That means looking for suppliers that offer compatible technology across different platforms. For example, Microsoft, IBM and Oracle all have significant software portfolios that can help minimize the number of other vendors an organization might need. Hybrid transaction/analytical processing (HTAP) capabilities that enable a single DBMS to run both transaction processing and analytics applications should also appeal to midsize organizations.

    Smaller organizations and those with minimal IT support should consider a data warehouse appliance or a cloud-based data warehouse as a service (DWaaS) offering. Both options make it easier to get up and running, and minimize the administration work needed to keep a data warehouse functional. In the cloud, for example, Amazon Redshift and IBM dashDB offer fully managed data warehousing services that can lower up-front implementation costs and ongoing management expenses.

    Regardless of company size, it can make sense for an organization to work with a vendor or product that it has experience using. For example, companies using Oracle Database might consider the Oracle Exadata Database Machine, Oracle's data warehouse appliance. Exadata runs Oracle Database 12c, so Oracle developers and DBAs should immediately be able to use the appliance. Also, the up-front system planning and integration required for data warehousing projects is eliminated with Exadata because it bundles the DBMS with compute, storage and networking technologies.

    A similar option for organizations that use IBM DB2 is the IBM PureData System for Analytics, which is based on DB2 for LUW. Keep in mind, however, that data warehouse appliances can be costly, at times pricing themselves out of the market for smaller organizations.

    Microsoft customers should consider the preview release of Microsoft Azure SQL Data Warehouse. It's a fully managed data warehouse service that's compatible and integrated with the Microsoft SQL Server ecosystem.

    2. What are your availability and performance needs?

    Other factors to consider include high availability and rapid response. Most organizations that decide to deploy a data warehouse will likely want both, but not every data warehouse actually requires them.

    When availability and performance are the most important criteria, DWaaS should be at the bottom of your list because of the lower speed imposed by network latency with cloud access. Instead, on-premises deployment can be tuned and optimized by IT technicians to deliver increased system availability and faster performance at the high end. This can mean using the latest features of an RDBMS, including the HTAP capabilities of Oracle Database, or IBM's DB2 with either the IBM DB2 Analytics Accelerator add-on product for DB2 for z/OS or BLU Acceleration capabilities for DB2 for LUW. Most RDBMS vendors offer capabilities such as materialized views, bitmap indexes, zone maps, and high-end compression for data and indexes. For most users, however, satisfactory performance and availability can be achieved with data warehouse appliances such as IBM PureData, Teradata Active EDW and Oracle Exadata. These platforms are engineered for data warehousing workloads, but require minimal tuning and administration.

    Another appliance to consider is the Actian Analytics Platform, which is designed to support high-speed data warehouse implementation and management. The platform combines relational and columnar capabilities, but also includes high-end features for data integration, analytics and performance. It can be a good choice for organizations requiring both traditional and nontraditional data warehouse queries. The Actian Analytics Platform includes Actian Vector, a Symmetric Multiprocessor DBMS designed for high-performance analytics, which exploits many newer, performance-oriented features such as single instruction multiple data. This enables a single operation to be applied on a set of data at once and CPU cache to be utilized as execution memory.
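    As a rough, product-neutral illustration of the vectorized-execution idea (this assumes NumPy is installed and is not Actian's implementation), compare row-at-a-time processing with a single operation applied to whole columns:

    ```python
    import time
    import numpy as np  # assumed available; used only to illustrate the idea

    rng = np.random.default_rng(0)
    amounts = rng.random(2_000_000)            # one column of a column store
    discounts = rng.random(2_000_000)

    # Row-at-a-time processing: the interpreter touches one value per iteration.
    start = time.perf_counter()
    total_rows = sum(a * (1.0 - d) for a, d in zip(amounts, discounts))
    row_time = time.perf_counter() - start

    # Vectorized processing: one operation applied to whole arrays at once, which
    # lets the CPU use SIMD instructions and keep the data hot in cache.
    start = time.perf_counter()
    total_vec = float(np.sum(amounts * (1.0 - discounts)))
    vec_time = time.perf_counter() - start

    print(f"row loop: {row_time:.2f}s, vectorized: {vec_time:.3f}s, "
          f"totals {total_rows:.1f} vs {total_vec:.1f}")
    ```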

    Pivotal Greenplum is an open source, massively parallel data warehouse platform capable of delivering high-speed analytics on large volumes of data. The platform combines relational and columnar capabilities and can be deployed on-premises as software or an appliance, or as a service in the cloud. Given its open source orientation, Pivotal Greenplum may be viewed favorably by organizations basing their infrastructure on an open source computing stack.

    3. Are you already in the cloud?

    DWaaS is probably the best option for companies that already conduct cloud-based operations. The other data warehouse platform options would require your business to move data from the cloud to an on-premises data warehouse. Keep in mind, though, that in addition to cloud-only options like Amazon Redshift, IBM dashDB and Microsoft Azure SQL Data Warehouse, many data warehouse platform providers offer cloud-based deployments.

    4. What are your data volume and latency requirements?

    Although many large data warehouses contain petabytes of raw data, every data warehouse implementation has different data storage needs. The largest data warehouses are usually customized combinations of RDBMS and analytic DBMS or HTAP implementations. As data volume requirements diminish, more varied options can be utilized, including data warehouse appliances.

    5. Is a data warehouse part of your big data strategy?

    Big data requirements have begun to impact the data warehouse, and many organizations are integrating unstructured and multimedia data into their data warehouse to combine analytics with business intelligence requirements -- aka polyglot data warehousing. If your project could benefit from integrated polyglot data warehousing, you need a platform that can manage and utilize this type of data. For example, the big RDBMS vendors -- IBM, Oracle and Microsoft -- are integrating support for nontraditional data and Hadoop in each of their respective products.

    You may also wish to consider IBM dashDB, which can process unstructured data via its direct integration with IBM Cloudant, enabling you to store and access JSON and NoSQL data. The Teradata Active EDW supports Teradata's Unified Data Architecture, which enables organizations to seamlessly access and analyze relational and nonrelational data. The Actian Analytics Platform delivers a data science workbench, simplifying analytics, as well as a scaled-out version of Actian Vector for processing data in Hadoop. Last, the Microsoft Azure SQL Data Warehouse enables analysis across many kinds of data, including relational data and semi-structured data stored in Hadoop, using its T-SQL language.

    Although organizations have been building data warehouses since the 1980s, the manner in which they are being implemented has changed considerably. After reading this four-part series, you should have a better idea of how modern data warehouses are built and what each of the leading vendors provides. Armed with this knowledge, you can make a more informed choice when purchasing data warehouse products.

    Source: TechTarget

  • Gartner positions Microsoft as a leader in the Magic Quadrant for Operational Database Management Systems

    Microsoft is placed furthest in vision and highest for ability to execute within the Leaders Quadrant.

    With the release of SQL Server 2014, the cornerstone of Microsoft’s data platform, we have continued to add more value to what customers are already buying. Innovations like workload-optimized in-memory technology, advanced security, and high availability for mission-critical workloads are built in instead of requiring expensive add-ons. We have long maintained that customers need choice and flexibility to navigate this mobile-first, cloud-first world and that Microsoft is uniquely equipped to deliver on that vision, both in trusted environments on-premises and in the cloud.

    Industry analysts have taken note of our efforts and we are excited to share Gartner has positioned Microsoft as a Leader, for the third year in a row, in the Magic Quadrant for Operational Database Management Systems. Microsoft is placed furthest in vision and highest for ability to execute within the Leaders Quadrant.

    Given customers are trying to do more with data than ever before across a variety of data types, at large volumes, the complexity of managing and gaining meaningful insights from the data continues to grow.  One of the key design points in Microsoft data strategy is ensuring ease of use in addition to solving complex customer problems. For example, you can now manage both structured and unstructured data through the simplicity of T-SQL rather than requiring a mastery in Hadoop and MapReduce technologies. This is just one of many examples of how Microsoft values ease of use as a design point. 

    Gartner also recognizes Microsoft as a leader in the Magic Quadrant for Business Intelligence and Analytics Platforms and placed Microsoft as a leader in the Magic Quadrant for Data Warehouse Database Management Systems – recognizing Microsoft’s completeness of vision and ability to execute in the data warehouse market.

    Offering only one piece of the data puzzle isn’t enough to satisfy all the different scenarios in today’s environments and workloads. Our commitment is to make it easy for customers to capture and manage data and to transform and analyze that data for new insights.

    Being named a leader in Operational DBMS, BI & Analytics Platforms, and DW DBMS Magic Quadrants is incredibly important to us: We believe it validates Microsoft is delivering a comprehensive platform that ensures every organization, every team and every individual is empowered to do more and achieve more because of the data at their fingertips.

     

  • Master Data Management and the role of (un)structured data

    Traditional conversations about master data management’s utility have centered on determining what actually constitutes MDM, how to implement data governance with it, and the balance between IT and business involvement in the continuity of MDM efforts.

    Although these concerns will always remain apposite, MDM’s overarching value is projected to significantly expand in 2018 to directly create optimal user experiences—for customers and business end users. The crux of doing so is to globalize its use across traditional domains and business units for more comprehensive value.

    “The big revelation that customers are having is how do we tie the data across domains, because that reference of what it means from one domain to another is really important,” Stibo Systems Chief Marketing Officer Prashant Bhatia observed.

    The interconnectivity of MDM domains is invaluable not only for monetization opportunities via customer interactions, but also for streamlining internal processes across the entire organization. Oftentimes the latter facilitates the former, especially when leveraged in conjunction with contemporary opportunities related to the Internet of Things and Artificial Intelligence.

    Structured and Unstructured Data

    One of the most eminent challenges facing MDM related to its expanding utility is the incorporation of both structured and unstructured data. Fueled in part by the abundance of external data besieging the enterprise from social, mobile, and cloud sources, unstructured and semi-structured data can pose difficulties to MDM schema.

    After attending the recent National Retail Federation conference with over 30,000 attendees, Bhatia noted that one of the primary themes was, “Machine learning, blockchain, or IoT is not as important as how does a company deal with unstructured data in conjunction with structured data, and understand how they’re going to process that data for their enterprise. That’s the thing that companies—retailers, manufacturers, etc.—have to figure out.”

    Organizations can integrate these varying data types into a single MDM platform by leveraging emerging options for schema and taxonomies with global implementations, naturally aligning these varying formats together. The competitive advantage generated from doing so is virtually illimitable. 

    Original equipment manufacturers and equipment asset management companies can obtain real-time, semi-structured or unstructured data about failing equipment and use it to enrich their product domain with attributes that indicate, for example, what is happening with a specific consumer’s tire. The aggregation of that semi-structured data with structured data in an enterprise-spanning MDM system can influence several domains.

    Organizations can reference it with customer data for either preventive maintenance or discounted purchase offers. The location domain can use it to provide these services close to the customer; integrations with lifecycle management capabilities can determine what went wrong and how to correct it. “That IoT sensor provides so much data that can tie back to various domains,” Bhatia said. “The power of the MDM platform is to tie the data for domains together. The more domains that you can reference with one another, you get exponential benefits.”

    Universal Schema

    Although the preceding example pertained to the IoT, it’s worth noting that it’s applicable to virtually any data source or type. MDM’s capability to create these benefits is based on its ability to integrate different data formats on the back end. A uniformity of schema, taxonomies, and data models is desirable for doing so, especially when using MDM across the enterprise. 

    According to Franz CEO Jans Aasman, traditionally “Master Data Management just perpetuates the difficulty of talking to databases. In general, even if you make a master data schema, you still have the problem that all the data about a customer, or a patient, or a person of interest is still spread out over thousands of tables.” 

    Varying approaches can address this issue; there is growing credence around leveraging machine learning to obtain master data from various stores. Another approach is to considerably decrease the complexity of MDM schema so it’s more accessible to data designated as master data. By creating schema predicated on an exhaustive list of business-driven events, organizations can reduce the complexity of myriad database schemas (or even of conventional MDM schemas) so that their “master data schema is incredibly simple and elegant, but does not lose any data,” Aasman noted.

    Global Taxonomies

    Whether simplifying schema based on organizational events and a list of their outcomes or using AI to retrieve master data from multiple locations, the net worth of MDM is based on the business’s ability to inform the master data’s meaning and use. The foundation of what Forrester terms “business-defined views of data” is oftentimes the taxonomies predicated on business use as opposed to that of IT. Implementing taxonomies enterprise-wide is vital for the utility of multi-domain MDM (which compounds its value) since frequently, as Aasman indicated, “the same terms can have many different meanings” based on use case and department.

    The hierarchies implicit in taxonomies are infinitely utilitarian in this regard, since they enable consistency across the enterprise yet have subsets for various business domains. According to Aasman, the Financial Industry Business Ontology (FIBO) can also function as a taxonomy in which, “The higher level taxonomy is global to the entire bank, but the deeper you go in a particular business you get more specific terms, but they’re all bank specific to the entire company.” 

    The ability of global taxonomies to link together meaning in different business domains is crucial to extracting value from cross-referencing the same master data for different applications or use cases. In many instances, taxonomies provide the basis for search and queries that are important for determining appropriate master data.
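    As a toy illustration of such a global taxonomy (the banking terms below are invented for illustration and are not FIBO itself), the upper levels stay shared across the whole organisation while deeper levels hold the vocabulary of one line of business:

    ```python
    # A toy, global taxonomy: the upper levels are shared across the whole bank,
    # while deeper levels hold terms specific to one line of business.
    TAXONOMY = {
        "FinancialInstrument": {
            "Loan": {
                "RetailBanking": ["Mortgage", "PersonalLoan"],
                "CorporateBanking": ["SyndicatedLoan", "RevolvingCreditFacility"],
            },
            "Derivative": {
                "Markets": ["InterestRateSwap", "FXForward"],
            },
        }
    }

    def path_to(term, tree, trail=()):
        """Return the full hierarchy path for a term, so each department can see
        how its specific vocabulary rolls up to the shared, global concepts."""
        for key, value in tree.items():
            if isinstance(value, dict):
                found = path_to(term, value, trail + (key,))
                if found:
                    return found
            elif term in value:
                return trail + (key, term)
        return None

    print(path_to("Mortgage", TAXONOMY))
    # ('FinancialInstrument', 'Loan', 'RetailBanking', 'Mortgage')
    ```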

    Timely Action

    By expanding the scope of MDM beyond traditional domain limitations, organizations can redouble the value of master data for customers and employees. By simplifying MDM schema and broadening taxonomies across the enterprise, they increase their ability to integrate unstructured and structured data for timely action. “MDM users in a B2B or B2C market can provide a better experience for their customers if they, the retailer and manufacturer, are more aware and educated about how to help their end customers,” Bhatia said.

     

    Author: Jelani Harper

    Source: Information Management

  • Three trends urging the modernization of data warehouses


    In the last couple of years, we’ve seen the rapid adoption of machine learning into the analytics environment, moving from science experiment to table stakes. In fact, at this point, I’m hard pressed to think of an enterprise that doesn’t have at least some sort of predictive or machine learning strategy already in place.

    Meanwhile, data warehouses have long been the foundation of analytics and business intelligence––but they’ve also traditionally been complex and expensive to operate. With the widespread adoption of machine learning and the increasing need to broaden access to data beyond just data science teams, we are seeing a fundamental shift in the way organizations should approach data warehousing.

    With this in mind, here are three broad data management trends I expect will accelerate this year:

    Operationalize insights with analytical databases

    I’m seeing a lot of convergence between machine learning and analytics. As a result, people are using machine learning frameworks such as R, Python, and Spark to do their machine learning.

    They then also do their best to make those results available in ways that are accessible to the rest of the business, beyond only data scientists. These talented data scientists are hacking away with their own tools, but those tools are simply not going to be used by business analysts. 

    The way to get the best of both worlds is to allow data scientists to use their tools of choice to produce their predictions, and then publish those results to an analytical database, which is more open to business users. The business user is already familiar with tools like Tableau, so by using an analytical database they can easily operationalize insights from the predictive model outcomes.
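    A minimal sketch of that hand-off, assuming pandas is available and using SQLite as a stand-in for the analytical database (the table and column names are invented), might look like this:

    ```python
    import sqlite3
    import pandas as pd

    # Pretend these rows came out of a model built in R, Python or Spark.
    predictions = pd.DataFrame({
        "customer_id": [101, 102, 103],
        "churn_probability": [0.82, 0.11, 0.45],
        "scored_at": pd.Timestamp("2018-01-15"),
    })

    # Publish the scores to an analytical database (SQLite as a stand-in) so that
    # analysts can reach them from familiar tools such as Tableau via SQL.
    con = sqlite3.connect("analytics.db")
    predictions.to_sql("churn_scores", con, if_exists="append", index=False)

    print(pd.read_sql("SELECT * FROM churn_scores WHERE churn_probability > 0.5", con))
    ```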

    Growth in streaming data sources

    Similar to the convergence of machine learning and analytics, I’m also seeing much greater interest in how to support streaming use cases or streaming data sources. 

    There are a number of technologies, among them Kafka, that provide a way to capture and propagate streams and do stream-based processing. Many systems from web analytics stacks to a single microservice in someone’s application stack are pushing out interesting events to a Kafka topic. But how do you consume that? 

    There are specialized streaming databases, for example, that allow you to consume this in real time. In some cases that works well but in others it's not as natural, especially when trending across larger data ranges. Accomplishing this is easier by pushing that streaming data into an analytics database.
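    A minimal sketch of that pattern, assuming the kafka-python package, a local broker, and a hypothetical 'web-clicks' topic, with SQLite again standing in for the analytics database:

    ```python
    import json
    import sqlite3
    from kafka import KafkaConsumer  # assumes the kafka-python package and a reachable broker

    # Consume click events from a (hypothetical) 'web-clicks' topic...
    consumer = KafkaConsumer(
        "web-clicks",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )

    # ...and land them in an analytics database so they can be trended over
    # longer time ranges alongside other warehouse data.
    con = sqlite3.connect("analytics.db")
    con.execute("CREATE TABLE IF NOT EXISTS clicks (ts TEXT, page TEXT, user_id TEXT)")

    for message in consumer:          # runs until interrupted
        event = message.value
        con.execute("INSERT INTO clicks VALUES (?, ?, ?)",
                    (event["ts"], event["page"], event["user_id"]))
        con.commit()
    ```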

    The ephemeral data mart

    The third trend I’m seeing more of, and I expect to accelerate in 2018, is what I would call the ephemeral data mart. 

    What I mean by that is to quickly bring together a data set, perform some queries, and then the data can be thrown away. As such, data resiliency and high availability become less important than data ingestion and computation speed. I’m seeing this in some of our customers and expect to see more.

    One customer in particular is using an analytics database to do processing of very large test results. By creating an ephemeral data mart for each test run, they can perform post-test analysis and trending, then just store the results for the longer term. 
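    A minimal, hypothetical sketch of such an ephemeral data mart (the sensor readings are generated for illustration): build an in-memory mart per test run, query it, and keep only the summary.

    ```python
    import random
    import sqlite3

    def analyse_test_run(measurements):
        """Build a throwaway data mart for one test run, query it, keep only the summary."""
        mart = sqlite3.connect(":memory:")          # ephemeral: gone when we close it
        mart.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
        mart.executemany("INSERT INTO readings VALUES (?, ?)", measurements)

        summary = mart.execute(
            "SELECT sensor, AVG(value), MAX(value) FROM readings GROUP BY sensor"
        ).fetchall()

        mart.close()                                # raw detail is thrown away
        return summary                              # only the result is stored long term

    run = [("temp", random.gauss(70, 5)) for _ in range(100_000)]
    print(analyse_test_run(run))
    ```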

    As organizations need better and more timely analytics that fit within their hardware and cost budgets, it’s changing the ways data is accessed and stored. The trends I’ve outlined above are ones that I expect to gather steam this year, and can serve as guideposts for enterprises that recognize the need to modernize their approach to data warehouses.

    Author: Dave Thompson

    Source: Information Management
