4 items tagged "Hadoop"

  • 10 Big Data Trends for 2017

    Infogix, a leader in helping companies provide end-to-end data analysis across the enterprise, today highlighted the top 10 data trends it foresees will be strategic for most organizations in 2017.
     
    “This year’s trends examine the evolving ways enterprises can realize better business value with big data and how improving business intelligence can help transform organization processes and the customer experience (CX),” said Sumit Nijhawan, CEO and President of Infogix. “Business executives are demanding better data management for compliance and increased confidence to steer the business, more rapid adoption of big data and innovative and transformative data analytic technologies.”
     
    The top 10 data trends for 2017 were assembled by a panel of Infogix senior executives. The key trends include:
     
    1.    The Proliferation of Big Data
        Proliferation of big data has made it crucial to analyze data quickly to gain valuable insight.
        Organizations must turn the terabytes of unused big data, classified as dark data, into usable data.
        Big data has not yet yielded the substantial results that organizations require to develop new insights for new, innovative offerings that yield a competitive advantage.
     
    2.    The Use of Big Data to Improve CX
        Using big data to improve CX when moving from legacy to vendor systems, during M&A, and with core system upgrades.
        Analyzing data with self-service flexibility to quickly harness insights about leading trends, along with competitive insight into new customer acquisition growth opportunities.
        Using big data to better understand customers in order to improve top line revenue through cross-sell/upsell or remove risk of lost revenue by reducing churn.
     
    3.    Wider Adoption of Hadoop
        More and more organizations will adopt Hadoop and other big data stores; in turn, vendors will rapidly introduce new, innovative Hadoop solutions.
        With Hadoop in place, organizations will be able to crunch large amounts of data using advanced analytics to find nuggets of valuable information for making profitable decisions.
     
    4.    Hello to Predictive Analytics
        Precisely predict future behaviors and events to improve profitability.
        Rapidly improve fraud detection to minimize revenue risk exposure and improve operational excellence.
     
    5.    More Focus on Cloud-Based Data Analytics
        Moving data analytics to the cloud accelerates adoption of the latest capabilities to turn data into action.
        Cut costs in ongoing maintenance and operations by moving data analytics to the cloud.
     
    6.    The Move toward Informatics and the Ability to Identify the Value of Data
        Use informatics to help integrate the collection, analysis and visualization of complex data, deriving revenue and efficiency value from that data.
        Tap an underused resource – data – to increase business performance.
     
    7.    Achieving Maximum Business Intelligence with Data Virtualization
        Data virtualization unlocks what is hidden within large data sets.
        Graphic data virtualization allows organizations to retrieve and manipulate data on the fly regardless of how the data is formatted or where it is located.
     
    8.    Convergence of IoT, the Cloud, Big Data, and Cybersecurity
        The convergence of data management technologies such as data quality, data preparation, data analytics, data integration and more.
        As we continue to become more reliant on smart devices, inter-connectivity and machine learning will become even more important to protect these assets from cyber security threats.
     
    9.    Improving Digital Channel Optimization and the Omnichannel Experience
        Balancing traditional and digital channels to connect with customers in their preferred channel.
        Continuously looking for innovative ways to enhance CX across channels to achieve a competitive advantage.
     
    10.    Self-Service Data Preparation and Analytics to Improve Efficiency
        Self-service data preparation tools shorten time to value by enabling organizations to prepare data of any type, whether structured, semi-structured or unstructured.
        Decrease reliance on development teams to massage the data by introducing more self-service capabilities that give power to the user and, in turn, improve operational efficiency.
     
    “Every year we see more data being generated than ever before and organizations across all industries struggle with its trustworthiness and quality. We believe the technology trends of cloud, predictive analysis and big data will not only help organizations deal with the vast amount of data, but help enterprises address today’s business challenges,” said Nijhawan. “However, before these trends lead to the next wave of business, it’s critical that organizations understand that the success is predicated upon data integrity.”
     
    Source: dzone.com, November 20, 2016
  • Big data vendors see the internet of things (IoT) opportunity, pivot tech and message to compete

    Open source big data technologies like Hadoop have done much to begin the transformation of analytics. We're moving from expensive and specialist analytics teams towards an environment in which processes, workflows, and decision-making throughout an organisation can - in theory at least - become usefully data-driven. Established providers of analytics, BI and data warehouse technologies liberally sprinkle Hadoop, Spark and other cool project names throughout their products, delivering real advantages and real cost-savings, as well as grabbing some of the Hadoop glow for themselves. Startups, often closely associated with shepherding one of the newer open source projects, also compete for mindshare and custom.

    And the opportunity is big. Hortonworks, for example, has described the global big data market as a $50 billion opportunity. But that pales into insignificance next to what Hortonworks (again) describes as a $1.7 trillion opportunity. Other companies and analysts have their own numbers, which do differ, but the step-change is clear and significant. Hadoop, and the vendors gravitating to that community, mostly address 'data at rest'; data that has already been collected from some process or interaction or query. The bigger opportunity relates to 'data in motion,' and to the internet of things that will be responsible for generating so much of this.
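    The distinction between data at rest and data in motion is easiest to see in code. Below is a minimal sketch in plain Python (the sensor readings are invented for illustration): instead of collecting everything and analyzing it later, a sliding-window average is updated as each event arrives.

```python
from collections import deque

def windowed_average(stream, window_size=3):
    """Yield a running average over the last `window_size` readings
    as each new one arrives -- analysis on data in motion."""
    window = deque(maxlen=window_size)
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)

# Hypothetical sensor readings arriving one at a time.
readings = [10.0, 12.0, 11.0, 30.0, 29.0]
averages = list(windowed_average(readings))
print(averages)
```

    Production systems would run the same idea on a streaming engine rather than in a single process, but the shape of the computation is the same.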

    My latest report, Streaming Data From The Internet Of Things Will Be The Big Data World’s Bigger Second Act, explores some of the ways that big data vendors are acquiring new skills and new stories with which to chase this new opportunity.

    For CIOs embarking on their IoT journey, it may be time to take a fresh look at companies previously so easily dismissed as just 'doing the Hadoop thing.' 

    Source: Forrester.com

  • Hadoop engine benchmark: How Spark, Impala, Hive, and Presto compare

    AtScale recently performed benchmark tests on the Hadoop engines Spark, Impala, Hive, and Presto. Find out the results, and discover which option might be best for your enterprise.

    The global Hadoop market is expected to expand at an average compound annual growth rate (CAGR) of 26.3% between now and 2023, a testimony to how aggressively companies have been adopting this big data software framework for storing and processing the gargantuan files that characterize big data. But to turbo-charge this processing so that it performs faster, additional engine software is used in concert with Hadoop.
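    To put that rate in perspective: a 26.3% CAGR sustained from 2016 through 2023 compounds to roughly a fivefold market, as a quick back-of-the-envelope calculation shows.

```python
# Compound annual growth: size_n = size_0 * (1 + rate) ** years.
rate = 0.263
years = 2023 - 2016  # seven compounding periods
growth_factor = (1 + rate) ** years
print(f"Growth over {years} years at {rate:.1%} CAGR: {growth_factor:.1f}x")
```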

    AtScale, a business intelligence (BI) Hadoop solutions provider, periodically performs BI-on-Hadoop benchmarks that compare the performances of various Hadoop engines to determine which engine is best for which Hadoop processing scenario. The benchmark results assist systems professionals charged with managing big data operations as they make their engine choices for different types of Hadoop processing deployments.

    Recently, AtScale published a new survey that I discussed with Josh Klahr, AtScale's vice president of product management.

    "In this benchmark, we tested four different Hadoop engines," said Klahr. "The engines were Spark, Impala, Hive, and a newer entrant, Presto. We used the same cluster size for the benchmark that we had used in previous benchmarking."

    What AtScale found is that there was no clear engine winner in every case, but that some engines outperformed others depending on what the big data processing task involved. In one case, the benchmark looked at which Hadoop engine performed best when it came to processing large SQL data queries that involved big data joins.

    "There are companies out there that have six billion row tables that they have to join for a single SQL query," said Klahr. "The data architecture that these companies use include runtime filtering and pre-filtering of data based upon certain data specifications or parameters that end users input, and which also contribute to the processing load. In these cases, Spark and Impala performed very well. However, if it was a case of many concurrent users requiring access to the data, Presto processed more data."

    The AtScale benchmark also looked at which Hadoop engine had attained the greatest improvement in processing speed over the past six months.

    "The most noticeable gain that we saw was with Hive, especially in the process of performing SQL queries," said Klahr. "In the past six months, Hive has moved from release 1.4 to 2.1—and on an average, is now processing data 3.4 times faster."
     
    Other Hadoop engines also showed processing performance gains over the past six months. Spark was processing data 2.4 times faster than it was six months ago, and Impala had improved its processing speed by 2.8% over the same period. In all cases, better processing speeds were being delivered to users.

    "What we found is that all four of these engines are well suited to the Hadoop environment and deliver excellent performance to end users, but that some engines perform in certain processing contexts better than others," said Klahr. "For instance, if your organization must support many concurrent users of your data, Presto and Impala perform best. However, if you are looking for the greatest amount of stability in your Hadoop processing engine, Hive is the best choice. And if you are faced with billions of rows of data that you must combine in complicated data joins for SQL queries in your big data environment, Spark is the best performer."

    Klahr said that many sites seem to be relatively savvy about Hadoop performance and engine options, but that a majority really hadn't done much benchmarking when it came to using SQL.

    "The best news for users is that all of these engines perform capably with Hadoop," sad Klahr. "Now that we also have benchmark information on SQL performance, this further enables sites to make the engine choices that best suit their Hadoop processing scenarios."

    Source: techrepublic.com, October 29, 2016

  • Hadoop: what is it actually for?


    Flexible and scalable big data management

    Data infrastructure is the backbone for creating and delivering good business insights. To take advantage of the diversity of available data and to modernize their data architecture, many organizations deploy Hadoop. A Hadoop-based environment is flexible and scalable in managing big data. What is the impact of Hadoop? The Aberdeen Group investigated the impact of Hadoop on data, people and company performance.

    New data from a variety of sources

    Large volumes of data must be captured, moved, stored and archived. But companies are now gaining insights from hidden data beyond traditional structured transaction data, such as e-mails, social data, multimedia, GPS information and sensor information. Alongside new data sources, we have also gained a host of new technologies to manage and exploit all this data. Together, this information and these technologies are shifting big data from problem to opportunity.

    What are the benefits of this yellow elephant (Hadoop)?

    A major forerunner of this big data opportunity is the Hadoop data architecture. The research shows that companies using Hadoop are more driven to make use of unstructured and semi-structured data. Another important trend is a shift in corporate mindset: companies increasingly see data as a strategic asset and an essential part of the organization.

    The need for user empowerment and user satisfaction is one reason companies choose Hadoop. In addition, a Hadoop-based architecture offers two benefits for end users:

    1. Data flexibility – All data under one roof, resulting in higher quality and usability.
    2. Data elasticity – The architecture is significantly more flexible in adding new data sources.

    What is the impact of Hadoop on your organization?

    What else can you do with Hadoop, and how can you best deploy this data architecture across your data sources? Read in this report how you can save even more time analyzing data and ultimately achieve more profit by deploying Hadoop.

    Source: Analyticstoday
