7 items tagged "tools"

  • BC - Business & Competitive - Intelligence

    BC (Business & Competitive) Intelligence

    Business Intelligence is a term that hardly lends itself to a literal Dutch translation. 'Bedrijfsintelligentie' (company intelligence) might come close, but, like other forms of intelligence, it is difficult to pin down precisely. 'Bedrijfsinzicht' (business insight) or 'bedrijfsbegrip' (business understanding) probably come closer. Other authors add business or environmental scanning as alternative translations.

    To understand Business and Competitive Intelligence properly, we use an analytical framework here (table 1.1). It makes it possible to distinguish the various manifestations of BI and thus to apply the right variant to the right problem. Keep in mind that these are stereotypes! In practice, hybrid forms occur.

    The starting point is that BI is viewed as an information process in which knowledge or insight is produced with the help of data.

     

    Approach                  Internal operations data    Business environment data

    Business approach         A                           B

    Technological approach    C                           D

    Table 1.1

    The table distinguishes a business approach from a technological one. BC Intelligence treats BI from the perspective of the business processes that need to be supported. There is also a technological perspective on BI, whose starting point is rather to exploit the possibilities that information technology offers for obtaining business insight. The other axis of the framework distinguishes data about internal operations (internal data) from data about the business environment (external data). We deliberately speak of distinguished rather than separated categories: in practice they can be distinguished but hardly separated. They cannot do without each other and are often supporting or complementary.

    Business Intelligence

     

    Approach                  Internal operations data    Business environment data

    Business approach         A                           B

    Technological approach    C                           D

    Although the distinction is arbitrary and the term BI could just as well be reserved for the entire matrix (with CI as a subset), many BI projects concern cells A and C.

    BI is thus often about optimizing business processes, with the emphasis on deriving business insight from data about the company itself. These data typically generate knowledge about the company's current situation, knowledge that is indispensable for strategy formation and the optimization of business results (think of Performance Management).

    The technological component of BI is represented by cell C. Unfortunately, this angle has the upper hand at many service providers. The emphasis there lies on setting up a technological infrastructure that makes adequate knowledge about the company and its performance possible. Cell C therefore covers ETL tools and data warehouses as well as analytical applications.

    BI-kring editorial note:

    In cell A we have hardly defined a category. In my view, Performance Management belongs there, so I would add that term. Possible key words: Key Performance Indicators, Performance Process Management, Organizational Performance, PDCA (Plan Do Check Act) Cycle, Performance Planning.

    As for C, we can refer to the text above: data warehousing and OLAP are the central elements there. Key words: databases, ETL (Extraction, Transformation and Load), architecture, data dictionary, metadata, data marts.

    With respect to OLAP, key words are: analytical applications, reporting, queries, multidimensional schemas, spreadsheet, cube, data mining.

    Competitive Intelligence

     

    Approach                  Internal operations data    Business environment data

    Business approach         A                           B

    Technological approach    C                           D

    CI is the process in which data about the company's environment are transformed into 'strategic business insight' through an information process. Although the terms Competitor and Competitive Intelligence have been in use since the 1980s, the approach already received attention in the 1970s under the label 'environmental scanning'.

    CI plays an important role in strategic as well as other management processes. The company's performance, its competitive position, possible future positions and its capacity for innovation can only be determined with the help of knowledge about the business environment.

    BI-kring editorial note:

    Competitive Intelligence thus concerns all information provision that is organized in order to determine, assess and change the competitive position of companies. It therefore directly touches on strategy, strategic intelligence, competitor analysis, competitive position, and all intelligence needed to properly assess the company's position in its environment.

    Organizing CI is still seriously neglected in organizations. It proves difficult to structure the necessary information processes, and equally difficult to execute them. The design of a 'system' that should realize this process is the center of attention, but for many organizations it is still a bridge too far. A sound development approach, however, considerably increases the chances of success.

    Data about the business environment are often unstructured in nature and available within the organization. The art is to make these data available for decision-making. When the data are not available within the company, the techniques and instruments needed to unlock them differ from those used in BI. They range from document management systems to information agents that autonomously search the internet for interesting building blocks (data!). Text mining is used to structure and analyze the unstructured documents (in the case of the web: web content mining).

    BI-kring editorial note

    To support Competitive Intelligence adequately, and in particular to make primary data available to the process, collaboration tools are popular. These are knowledge-management-like systems and shareware applications that facilitate the sharing of data, information and knowledge. Key words: knowledge management, shareware, SharePoint.

    Overview of BI-kring data categories

    Cell A

    • Performance Management

    Key words: Key Performance Indicators, Performance Process Management, Organizational Performance, PDCA (Plan Do Check Act) Cycle, Performance Planning.

    Cell C

    • Datawarehousing

    Key words: databases, ETL (Extraction, Transformation and Load), architecture, data dictionary, metadata, data marts, Big Data.

    • Online Analytical Processing

    Key words: analytical applications, reporting, queries, multidimensional schemas, spreadsheet, cube, data mining, dashboarding.

    Cell B

    • Competitive Intelligence

    Key words: strategy, strategic intelligence, competitor analysis, competitive position, competitor intelligence, technological intelligence, environmental scanning, environmental intelligence.

    • Content (Competitive Intelligence as a product)

    Key words:

    Cell D

    • Collaboration

    Key words: knowledge management, shareware, SharePoint.

    • Search methodologies

    Key words: document management systems, spider technology, unstructured information, information agents, text mining, content mining, search technologies.

    Across the entire framework (intelligent organizations have implemented the whole model integrally)

    • Intelligent organization

    Key words: management information, intelligent organization, learning organization, organizational learning, learning, intelligence change management.

     

    Source: Egbert Philips

     

  • Is the age of Excel now really over?

    29 May 2015

    The amount of data that a marketer or sales manager has to work with has increased enormously in recent years, and that growth is exponential. With the rise of e-commerce, the introduction of customer loyalty programs and the advent of beacons, the amount of data we receive from consumers only keeps growing.

    In practice, however, many marketers still work with Excel to manage their campaigns, business cases and plans. If you want to compose a personalized campaign for each of your customers, this is no longer tenable: Excel cannot be linked to online tools, version control is a major problem, and the content is by definition outdated by the time the file is opened. We predict that in 2015 marketers and sales managers will switch to online tools to manage their processes better and to use the resulting data directly to become more personal and relevant.

    Want to know more about these tools and the areas of expertise needed to implement them properly? Follow BI-kring in the coming months.

    BI-kring editors

  • Don't Get Ubered: Rethinking Your Competitive Intelligence Approach

    “‘You’ve been Ubered’ will become part of our lexicon to describe industries blindsided by the future,” says Tony Chapman, a Canadian consumer and branding expert, in reference to the challenges that “Big Taxi” is currently facing due to the growing popularity of the Uber rideshare app.


    In fact, every industry now has to face the fact that you “are either Uber, or you’re being Ubered.”

    So, while it is important to look at what your direct competitors are doing when evaluating the competitive landscape for your ecommerce business, it’s even more important to be aware of key trends that will shape where your market is headed.

    To help you create a competitive business strategy that keeps your business agile and adaptable to continuously evolving market conditions and competitors, let’s take a look at what some business strategists and analysts are recommending to startups and entrepreneurs.  After all, most businesses will need to adopt startup strategies in order to remain relevant 10 to 20 years from now. 

    Understanding Your Competitive Landscape

    To understand the big picture of how your business can thrive in the future, it helps to look at your competitive landscape from many different angles. The handy SlideShare presentation from Startup Next on market sizing and competitive analysis recommends evaluating three competitive categories in your environment:

    1. Direct competitors

    • Big retailers in your industry (e.g. Amazon and Walmart)
    • Other businesses that “solve your unique problem” (including smaller, niche mom and pop shops or even Etsy shops)

    2. Indirect competitors

    • Economic and political trends (e.g. customer privacy)
    • Regulation, government legislation and trade agreements

    3. Future threats

    • Cultural shifts (e.g. usage of mobile devices overtaking desktop computers)
    • Tech innovations (e.g. wearables, 3D printing and virtual reality)
    • Possible changes that your business partners or suppliers are planning

    Once you know who those competitors are, it’s time to evaluate the market opportunities and consumer touchpoints related to each of them.

    A Startup Approach For Evaluating Future Competitors And Market Opportunities

    “We need a different way to represent the competitive landscape when you are creating a business that never existed or taking share away from incumbents by re-segmenting an existing market,” says Steve Blank, a serial entrepreneur, Stanford professor and author whose book The Four Steps to the Epiphany was influential in the launch of the Lean Startup movement.

    So, if every business now needs to think like a startup in order to avoid “getting Ubered,” why not begin evaluating your market opportunities like a startup right now?

    Image via SteveBlank.com

    To do so, Blank recommends putting your business at the center of your competitive analysis diagram (versus plotting it out on an x, y axis – with your startup at the top right) and then branching out to key adjacent market segments that exist today. This will help to identify where you think your new customers might come from in the future.

    He calls this a “Petal Diagram” and his argument for this approach is that he always thought of his startups as “the center of the universe.” The example diagram above is for a startup education software platform. But the same format could be applied to any retail or ecommerce business.

    So, say your ecommerce business sells specialty sports equipment. You may want to add market segments that you don’t cater to right now but might in the future – thanks to new technologies. For example, advances in 3D printing technologies will allow you to sell to customers in countries to which you don’t currently ship your products. Or, you may be able to work with new distributors or manufacturers with whom you don’t currently have business relationships.

    Blank says that there is no limit to the number of “petals” or adjacent markets that you can map out. And for better visualization, he recommends that the size of the petals can be scaled to the size of the market opportunity for each segment.

    Image via SteveBlank.com

    “The petal diagram is where you [startups and entrepreneurs] develop your first hypothesis about who your customers are,” says Blank.

    While you probably already know who your existing customers are if you are running a thriving ecommerce business, it’s still important to consider that your future customers may look and behave a lot differently.

    And although the purpose of the petal diagram is to show potential investors why they should put their money into a startup, the same diagram can help your leadership team decide where to place their biggest bets and/or to allocate budgets towards R&D for future business growth.

    Tools For Evaluating Your Competition Online


    Image via Pixabay

    Whether you want to size up your direct competitors, or research current and evolving industry trends, there’s an app or (sometimes free) online tool for that.

    Below are a few suggestions for where you can gather useful competitive intelligence data.

    1. Upstream Commerce
    While you do have to pay for this service, Upstream Commerce offers “automated, real-time intelligence analytics” and insights to help you evaluate competitive pricing, merchandising, promotion and product intelligence – across a number of retail industry categories. The company boasts that its data is easily customizable to help you build detailed, filterable results.

    2. Channel IQ
    Similar to Upstream Commerce, Channel IQ offers competitive intelligence analytics for price monitoring, product intelligence and more. But what sets it apart is that it offers competitive brand intelligence and protection tools. This includes paid search brand protection – allowing you to monitor PPC & keyword “brand-jacking,” so that your competitors can’t “illegally divert your traffic using your registered trade name.”

    3. The Google Keyword Planner tool (or other similar free tools) and Google Trends can help to evaluate consumer demand (via search queries) for your competitors’ products. These tools can also help you evaluate demand for specific products that you or your competitors carry.

    In addition, by signing up to receive Google Alerts via email whenever your competitors are covered by media or bloggers online, you can stay up-to-date on when they launch new products or when the company is receiving positive or negative press.

    4. Social media monitoring platforms
    In addition to looking at online search behavior, it’s important to look at what people are actually saying about your brand – and whether the conversation is moving in a positive or negative direction.

    There are a lot of social media monitoring platforms available for listening to what customers are saying on social networking sites. Some of the more popular ones include: HootSuite, TweetDeck and Sysomos. Here’s a helpful blog post from Ryan Holmes, the CEO of HootSuite, on how to “listen” to the competition via social media.

    5. eMarketer.com
    While the research, insights and benchmark reports written by eMarketer analysts are crafted with a marketing slant, the company’s ecommerce and mobile commerce reports provide rich, aggregated data from some of the most important research companies that study digital trends today.

    And even if you can’t afford to pay for their full reports, if you sign-up for their free newsletter and search through their public articles, you can access a lot of the most important highlights and charts for use in your own competitive intelligence analysis and strategy development.

    6. Alexa.com and other web analytics tools
    While the insights that you can glean from the free version of the Alexa tool are limited, you can still gain a high-level overview of:

      • how your competitors’ websites rank online (both worldwide or in your own country),
      • how their website performs overall (via graphs highlighting the bounce rate, pageviews per visitor and average visitor’s time spent on the website), and
      • where they may be investing in online marketing efforts via the top “upstream websites” graph (i.e. the top websites that send traffic to your competitors’ websites).

    The paid version of the tool is more robust – giving you further intel into sites linking in, keywords driving traffic to the site and overall website comparisons.

    7. comScore and HitWise
    These tools also do a lot of what the paid version of the Alexa tool does and are pretty popular for gathering online consumer behavioral data. And like Alexa, both products will enable you to look at traffic on your competitors’ sites – offering variations on how people get there, what popular search terms were used and who those people are. But the way the data is collected is different: comScore data is collected via a panel of users who opt in to be tracked, while HitWise data is based on aggregated ISP user data.

    If you have the budget, then I’d say pay for both of these tools. If not, the free Alexa tool is a great place to start.

    8. Finally, although Mary Meeker is a person (and a very influential one at that) and obviously not an app, her annual Internet Trends Report has become an important destination for anyone who wants to know what’s happening online today – or prepare for what will happen in the future. So, I suggest you add her slide deck to your “future threats” intelligence arsenal.

    For a list of even more competitive intelligence tools, check out this post on the Shopify blog.

     

  • Essential Data Science Tools And Frameworks


    The fields of data science and artificial intelligence see constant growth. As more companies and industries find value in automation, analytics, and insight discovery, there comes a need for the development of new tools, frameworks, and libraries to meet increased demand. There are some tools that seem to be popular year after year, but some newer tools emerge and quickly become a necessity for any practicing data scientist. As such, here are ten trending data science tools that you should have in your repertoire in 2021.

    PyTorch

    PyTorch can be used for a variety of tasks, from building neural networks to decision trees, thanks to a range of extensible libraries, and its interface will feel familiar to Scikit-Learn users, making it easy to get on board. Importantly, the platform has gained substantial popularity and established community support that can be integral in solving usage problems. A key feature of PyTorch is its use of dynamic computational graphs, in which the order of computations is defined on the fly by the model structure, in a neural network for example.
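    To make the idea of a dynamic computational graph concrete, here is a minimal sketch; the layer sizes and random input are invented purely for illustration, and the graph is rebuilt every time data flows through forward().

```python
import torch

# A tiny two-layer network defined imperatively. PyTorch builds the
# computational graph on the fly, each time data flows through forward().
# Layer sizes and input data are arbitrary, for illustration only.
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = torch.nn.Linear(4, 8)
        self.out = torch.nn.Linear(8, 2)

    def forward(self, x):
        # This ordinary Python code *is* the graph definition: any control
        # flow here is traced dynamically on each call.
        return self.out(torch.relu(self.hidden(x)))

net = TinyNet()
y = net(torch.randn(3, 4))  # a batch of 3 examples with 4 features each
print(y.shape)              # torch.Size([3, 2])
```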

    Scikit-learn

    Scikit-learn has been around for quite a while and is widely used by in-house data science teams. It is a platform not only for training and testing models but also for building complete NLP and NLU workflows. In addition to working well with many other data science tools and libraries, such as NLTK, it has its own extensive library of models. Many NLP and NLU projects involve the classic workflow of feature extraction, training, testing, model fitting, and evaluation, and scikit-learn’s pipeline module fits this purpose well.
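    As a sketch of that classic workflow, here is a minimal, hypothetical text-classification pipeline; the tiny labeled dataset is invented for illustration, and Pipeline chains feature extraction and model fitting into a single estimator.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny invented dataset: 1 = positive sentiment, 0 = negative.
texts = ["great product", "awful service", "loved it",
         "terrible experience", "really great", "just awful"]
labels = [1, 0, 1, 0, 1, 0]

# Feature extraction + classifier wired together as one estimator.
clf = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("model", LogisticRegression()),
])
clf.fit(texts, labels)               # train
preds = clf.predict(texts)           # predict (on the training data here)
print("training accuracy:", clf.score(texts, labels))
```

    In a real project the same pipeline object would be passed to cross-validation or grid search, which is exactly why the pipeline module fits the feature-extraction/train/test/evaluate loop so well.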

    CatBoost

    Gradient boosting is a powerful machine-learning technique that achieves state-of-the-art results in a variety of practical tasks. For a number of years, it has remained the primary method for learning problems with heterogeneous features, noisy data, and complex dependencies: web search, recommendation systems, weather forecasting, and many others. CatBoost is a popular open-source gradient boosting library with a whole set of advantages, such as being able to incorporate categorical features in your data (like music genre or city) with no additional preprocessing.

    Auto-Sklearn

    AutoML automatically finds well-performing machine learning pipelines, which allows data scientists to focus their efforts on other tasks, lowering the barrier to applying machine learning broadly and making it available to everyone. Auto-Sklearn frees a machine learning user from algorithm selection and hyperparameter tuning, allowing them to spend their time with other data science tools. It leverages recent advances in Bayesian optimization, meta-learning, and ensemble construction.

    Neo4J

    As data becomes increasingly interconnected and systems increasingly sophisticated, it’s essential to make use of the rich and evolving relationships within our data. Graphs are uniquely suited to this task because they are, very simply, a mathematical representation of a network. Neo4J is a native graph database platform, built from the ground up to leverage not only data but also data relationships.

    Tensorflow

    This Google-developed framework excels where many other libraries don’t, such as with its scalable nature designed for production deployment. TensorFlow is often used for solving deep learning problems, from training and evaluation all the way to model deployment. Apart from machine learning purposes, TensorFlow can also be used for building simulations based on partial differential equations. That’s why it is considered an all-purpose framework and one of the more popular data science tools among machine learning engineers.

    Airflow

    Apache Airflow is a data science tool created by the Apache community to programmatically author, schedule, and monitor workflows. Its biggest advantage is that it does not limit the scope of pipelines: Airflow can be used for building machine learning models, transferring data, or managing infrastructure. The most important thing to understand about Airflow is that it is an “orchestrator”: it does not process data on its own, it only tells other systems what has to be done and when.

    Kubernetes

    Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Originally developed by Google, Kubernetes progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn’t kill all your instances at the same time.

    Pandas

    Pandas is a popular data analysis library built on top of the Python programming language, and getting started with Pandas is an easy task. It assists with common manipulations for data cleaning, joining, sorting, filtering, deduping, and more. First released in 2009, pandas now sits as the epicenter of Python’s vast data science ecosystem and is an essential tool in the modern data analyst’s toolbox.
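    As a small illustration of those common manipulations (the tables and column names below are made up), joining, sorting, filtering and deduping compose naturally on a DataFrame:

```python
import pandas as pd

# Invented toy data: an orders table with one duplicated row,
# and a lookup table mapping customers to regions.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "customer": ["ann", "bob", "bob", "cid"],
    "amount":   [10.0, 25.0, 25.0, 7.5],
})
regions = pd.DataFrame({"customer": ["ann", "bob"],
                        "region":   ["EU", "US"]})

clean = (orders
         .drop_duplicates()                          # deduping
         .merge(regions, on="customer", how="left")  # joining
         .sort_values("amount", ascending=False))    # sorting
big = clean[clean["amount"] > 9]                     # filtering
print(big[["order_id", "amount", "region"]])
```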

    GPT-3

    Generative Pre-trained Transformer 3 (GPT-3) is a language model that uses deep learning to produce human-like text. GPT-3 is the most recent language model from the OpenAI research lab, announced in a May 2020 research paper, “Language Models are Few-Shot Learners.” While a tool like this may not be something you use daily as an NLP professional, being able to generate human-like text, answer questions, and even create code makes it an interesting capability to be familiar with.

    Author: Alex Landa

    Source: Open Data Science

  • Finding the right monitoring tool: start asking questions!


    I spend a lot of my time at conferences and there’s one question I often hear with increasing frequency: Why are there so many monitoring tools?

    Looking around the vendor area at any conference, I see loads of monitoring tools and companies, and attendees are often overwhelmed by the sea of options. How can all of these tools be different and/or better than the other? The problem starts with the way we talk about monitoring.

    What does monitoring really mean?

    Monitoring is a pretty vague term: I can monitor my deployments, my application performance, and the cupcakes baking in the oven. It’s more important to ask ourselves, 'What problem are we trying to solve through monitoring?' I can buy a fire extinguisher because my cupcakes keep burning, or I can buy an oven timer so I remember to take them out of the oven. Simply put: there are a lot of monitoring tools because they solve a wide variety of problems. To differentiate and find value in the tools available, we have to know what problem we’re trying to solve. A lot of us have to address downtime in our systems, so we turn to monitoring solutions after some catastrophic failure. Others turn to monitoring to solve the tricky issue of resource allocation: there are entire consultancies built around which services use the most resources and how we can optimize them. Others use monitoring data to forecast sales in order to better measure the accuracy and success of their systems.

    Monitoring is so important because businesses and technology both survive on the premise that our services are available for as much of the time as possible. System uptime means community engagement and sales and relationships being built. There’s an abundance of monitoring tools because different systems and types of data require different approaches. Let’s talk about how you can start pruning the landscape.

    How do I find the right monitoring tool?

    To find the monitoring tool that fits your needs, here are a few questions to ask yourself:

    What type of data do you want to monitor? Are you gathering metrics, events, logs, user data, a combination of these, or something else entirely? Different types of data have different requirements for how we collect, store and analyze them. Look for tools made specifically for your type of data so you spend less time on setup.

    What is the source of your data? Is your data being generated by IoT sensors, web browsers, AWS or local servers? When choosing a monitoring tool, make sure it fits easily into the pipelines that already exist. Some tools are made specifically for the cloud while others are made for industrial IoT machinery. If you’re using a combination of data sources, you might have to think about a flexible collection agent.

    Do you have performance requirements? If you have limited resources or strict performance requirements, you can’t always instrument your applications with a full suite of monitoring tools. Find out how much work you have to do to optimize the tool. Look for benchmarks, user stories and hardware requirements.

    These are just a few questions to get started, and hopefully researching these will lead you to even more questions. Don’t be afraid to ask companies how their products are different. These are the questions that allow each monitoring company to show you its strengths. Here are a few questions you can start with:

    1. What type of data is the tool built for? Purpose-built solutions that match your data will be an easier transition.
    2. What ecosystem is the tool made to run in? Some tools are built specifically for a cloud-only environment while others are made for bare metal servers.
    3. Do you have an open source version or a free trial? There are a lot of pitfalls when it comes to implementing monitoring solutions, and being able to test them out before any money is involved is so much nicer.
    4. Bonus: I like to ask the person I’m talking to what their favorite feature of the product is. It can be surprising and enlightening.

    Get out there and start asking questions!

    Author: Katy Farmer

    Source: Insidebigdata 

  • Overview of MI Tools needed


    Organisations with access to the right market information at the right time have the power to transform themselves into successful market leaders while companies that don’t have access to accurate and current information are taking unnecessary risks that may undermine expansion and profitability. 

    Market intelligence is the process of acquiring, processing and analysing information about customers, competitors and markets. Market Intelligence as a systematic corporate activity is becoming increasingly commonplace not only in the largest global companies but also in small and middle-sized enterprises.

    While competition has grown, the media sources that provide critical business information have exploded exponentially. Information is being created at an alarming rate and the tools required to collect and analyse this information are becoming increasingly complex. 

    An overview of these tools and their functionality is missing. Companies that have the right tools to access this data are more likely to succeed. TranslateMedia is one of the few companies that has published a report containing examples of data sources, intelligence tools and tips that should allow businesses to gain greater insights into their target markets, thereby increasing the likelihood of global success.

    www.translatemedia.com

  • The 10 Commandments of Business Intelligence in Big Data

    Organizations today don’t use previous-generation architectures to store their big data. Why would they use previous-generation BI tools for big data analysis? When looking at BI tools for your organization, there are 10 “Commandments” you should live by.

    First Commandment: Thou Shalt Not Move Big Data
    Moving Big Data is expensive: it is big, after all, so physics is against you if you need to load it up and move it. Avoid extracting data out into data marts and cubes, because “extract” means moving, and creates big-data-sized problems in maintenance, network performance and additional CPU — on two copies that are logically the same. Pushing BI down to the lower layers to run at the data is what motivated Big Data in the first place.

    Second Commandment: Thou Shalt Not Steal!...Or Violate Corporate Security Policy
    Security’s not optional. The sadly regular drumbeat of data breaches shows it’s not easy, either. Look for BI tools that can leverage the security model that’s already in place. Big Data can make this easier, with unified security systems like Ranger, Sentry and Knox; even Mongo has an amazing security architecture now. All these models allow you to plug right in, propagate user information all the way up to the application layer, and enforce a visualization’s authorization and the data lineage associated with it along the way. Security as a service: use it.

    Third Commandment: Thou Shalt Not Pay for Each User, Nor Every Gigabyte
    One of the fundamental beauties of big data is that, done right, it can be extremely cost effective. Putting five petabytes of data into Oracle could break the bank, but you can do just that in a big data system. That said, there are certain price traps to watch out for before you buy. Some BI applications charge per user, per gigabyte, or per gigabyte indexed. Caveat emptor! Geometric, even exponential, growth in both data and adoption is common with big data. Our customers have seen deployments grow from tens of billions of entries to hundreds of billions in a matter of months, with the user base up by 50x. That’s another beauty of big data systems: incremental scalability. Make sure you don’t get lowballed into a BI tool that penalizes your upside.
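
    A rough arithmetic sketch makes the trap concrete. The prices below are made-up assumptions, not real vendor figures; the point is only how per-gigabyte licensing scales against a flat rate when data grows the way the article describes:

```python
# Illustrative arithmetic only: made-up license prices, not vendor quotes.
price_per_gb = 10.0      # assumed cost per indexed gigabyte
flat_price = 50_000.0    # assumed flat-rate alternative

def per_gb_cost(gigabytes):
    return gigabytes * price_per_gb

start_gb = 2_000          # initial deployment size
grown_gb = start_gb * 10  # data grows 10x in a matter of months

start_cost = per_gb_cost(start_gb)  # looks cheaper than flat at first
grown_cost = per_gb_cost(grown_gb)  # now several times the flat rate
```

    A per-gigabyte deal that undercuts the flat rate on day one can overtake it within months, which is exactly the lowball dynamic to watch for.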

    Fourth Commandment: Thou Shalt Covet Thy Neighbor’s Visualizations
    Sharing static charts and graphs? We’ve all done it: publishing PDFs, exporting to PNGs, email attachments, and so on. But with big data and BI, static won’t cut it: all you have is pretty pictures. You should be able to let anyone you want interact with your data. Think of visualizations as interactive roadmaps for navigating data; why should only one person take the journey? Publishing interactive visualizations is only the first step. Look ahead to the GitHub model: rather than “here’s your final published product,” offer “here is a viz; clone it, fork it, see how I arrived at these insights, and find out what other problem domains it applies to.” It lets others learn from your insights.

    Fifth Commandment: Thou Shalt Analyze Thy Data In Its Natural Form
    Too often, I hear people referring to big data as “unstructured.” It’s far more varied than that. Finance and sensors generate tons of key-value pairs. JSON, probably the trendiest data format of all, can be semi-structured, multi-structured, and more. MongoDB has made a huge bet on letting data stay in this format: beyond its virtues for performance and scalability, expressiveness gets lost when you convert it into rows and tables. And lots of big data is still created in tables, often with thousands of columns, and you’re going to have to do relational joins over all of it: “select this from there when that...” Flattening can destroy critical relationships expressed in the original structure. Stay away from BI solutions that tell you “please transform your data into a pretty table because that’s the way we’ve always done it.”
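
    A tiny example shows what flattening loses. The order document below is a made-up illustration: in its natural nested form the order-to-line-item relationship survives; squeezed into one fixed-column row it does not:

```python
# Sketch of the Fifth Commandment with an illustrative nested document.
order = {
    "order_id": 17,
    "customer": "acme",
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

# Natural form: every line item is still tied to its order.
skus = [item["sku"] for item in order["items"]]

# Flattened form: one row with fixed columns. The second item either
# gets dropped (as here) or forces ever more columns (item2_sku, ...).
flat = {"order_id": 17, "customer": "acme", "sku": "A1", "qty": 2}
```

    The nested document answers “which SKUs belong to order 17?” directly; the flattened row has silently lost one of them.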

    Sixth Commandment: Thou Shalt Not Wait Endlessly For Thine Results
    In 2016 we expect things to be fast. One classic approach to good performance is OLAP cubes, which essentially move the data into a pre-computed cache. The problem is that you have to extract and move data to build the cube before you get that performance (see the First Commandment). This can work pretty well at a certain scale... until the temp table becomes gigantic and crashes your laptop when it tries to materialize locally. New data will stop analysis in its tracks while you extract it to rebuild the cache. Be wary of sampling, too: you may end up building a visualization that looks great and performs well, only to realize it’s all wrong because you didn’t have the whole picture. Instead, look for BI tools that make it easy to continuously change which data you are looking at.
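
    The sampling trap is easy to demonstrate with a made-up, deterministic series: an aggregate computed over a convenient sample (here, the first 90 days of a metric) looks healthy while completely missing a later incident:

```python
# Illustrative data only: a quiet quarter followed by a 10-day incident.
daily_errors = [1] * 90 + [500] * 10

# Sampling only the first 90 days: everything looks fine.
sample_avg = sum(daily_errors[:90]) / 90

# The full picture tells a very different story.
true_avg = sum(daily_errors) / len(daily_errors)
```

    The sampled chart would render quickly and look great, which is precisely why it is dangerous: nothing in it hints that the picture is incomplete.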

    Seventh Commandment: Thou Shalt Not Build Reports, But Apps Instead
    For too long, “getting the data” meant getting a report. In big data, BI users want asynchronous data from multiple sources so they never need to hit refresh, just like everything else that runs in browsers and on mobile devices. Users want to interact with the visual elements to get the answers they’re looking for, not just cross-filter the results you already gave them. Frameworks like Rails made it easier to build web applications; why not do the same for BI apps? There is no good reason not to take a similar approach, with APIs, templates, reusability, and so on. It’s time to look at BI through the lens of modern web application development.

    Eighth Commandment: Thou Shalt Use Intelligent Tools
    BI tools have proven themselves when it comes to recommending visualizations based on data. Now it’s time to do the same for automatic maintenance of models and caching, so end users don’t have to worry about it. At big data scale it’s almost impossible to live without this: there’s a wealth of information in how users interact with the data and visuals, which modern tools should use to leverage data network effects. Also, look for tools with search built in for everything, because I’ve seen customers who literally have thousands of visualizations built out. You need a way to find results quickly, and the web has trained us to search instead of digging through menus.

    Ninth Commandment: Thou Shalt Go Beyond The Basics
    Today’s big data systems are known for predictive analytical horsepower. Correlation, forecasting, and more, all make advanced analytics more accessible than ever to business users. Delivering visualizations that can crank through big data without requiring programming experience empowers analysts and gets beyond a simple fixation on ‘up and to the right.’ To realize its true potential, big data shouldn’t have to rely on everyone becoming an R programmer. Humans are quite good at dealing with visual information; we just have to work harder to deliver it to them that way.

    Tenth Commandment: Thou Shalt Not Just Stand There On the Shore of the Data Lake Waiting for a Data Scientist To Do the Work
    Whether you approach big data as a data lake or an enterprise data hub, Hadoop has changed the speed and cost of data, and we’re all helping to create more of it every day. But when it comes to actually using big data for business users, it is too often a write-only system: data created by the many is used only by the few.

    Business users have a ton of questions that can be answered with data in Hadoop. Business Intelligence is about building applications that deliver that data visually, in the context of day-to-day decision making. The bottom line is that everyone in an organization wants to make data-driven decisions. It would be a terrible shame to limit all the questions that big data can answer to those that need a data scientist to tackle them.

     Source: Datanami
