In February 2001, Doug Laney published “3-D Data Management: Controlling Data Volume, Velocity and Variety,” in which he described how to recognize and, more importantly, deal with the growing challenges of data along three dimensions: volume, velocity, and variety.
In his article, Laney was the first to write about organizations collecting data from a variety of sources, such as internal business transactions, website eCommerce transactions, embedded forms, IoT devices, industrial equipment, videos, social media, and business SaaS applications (e.g., marketing automation platforms and CRMs).
Until then, collecting large amounts of data ran into limits in both performance and storage space, but new data warehouses were making it possible to store and process larger data sets.
The Introduction of Big Data
The term big data was coined to describe a volume of structured and unstructured data so large that it could not be processed, or could not be easily processed, with what were then traditional data-handling methods. More important than the term itself is how organizations have come to use and analyze big data to gain insights that lead to better business decisions.
Perhaps the most visible example of big data is Google Analytics, which is used by 30 to 50 million websites. Google not only created an application that gave the company insights into its own customer base; directly and through third-party vendors, it also enabled customers to use the same data for insights into their own businesses.
Big data on its own does not tell the story; it’s what you can do with big data that puts businesses on the edge of their seats. Real enlightenment begins when organizations integrate the data they are collecting with their own data sets to reduce costs and time, optimize product development, create or expand eCommerce stores, and serve other uses spanning countless industries.
When businesses use data from a third party, it’s called alternative data (i.e., data from an alternative source). Businesses don’t have to collect their own big data to be able to implement alternative data within their organization. Sources already exist and can benefit businesses both large and small.
Collecting and Integrating Alternative Data
With the increasing demand for alternative data, an entire industry has sprung up around its collection and integration. The services these companies provide are known as web data scraping (gathering data from other websites) and web data integration (integrating data from other websites into existing data). Some companies provide both services.
Web data scraping, or web data extraction, is a process in which data is collected from a website, but the receiving business is responsible for developing internal processes to cleanse, distill, and integrate the data into existing workflows.
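To make the cleansing step concrete, here is a minimal sketch in Python of what a receiving business might do with raw scraped rows. The field names (`name`, `price`), the sample rows, and the `cleanse` function are all hypothetical and chosen only for illustration; real scraped data would need rules specific to its source.

```python
import re

def cleanse(raw_rows):
    """Normalize scraped rows; drop duplicates and rows missing a name or price."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        # Collapse runs of whitespace and trim the ends of the name.
        name = " ".join(row.get("name", "").split())
        # Keep only digits and the decimal point, e.g. "$1,299.00" -> "1299.00".
        price_text = re.sub(r"[^\d.]", "", row.get("price", ""))
        if not name or not price_text:
            continue
        try:
            price = float(price_text)
        except ValueError:
            continue  # e.g. "1.2.3" is not a usable price
        key = name.lower()
        if key in seen:
            continue  # same item already kept, ignoring letter case
        seen.add(key)
        cleaned.append({"name": name, "price": price})
    return cleaned

# Hypothetical sample rows, as they might come out of a scraper.
raw = [
    {"name": "  Blue   Widget ", "price": "$1,299.00"},
    {"name": "blue widget", "price": "$1,299.00"},
    {"name": "Red Widget", "price": "N/A"},
    {"name": "", "price": "$5.00"},
]
print(cleanse(raw))
```

Only the first row survives in this sample: the second is a duplicate, the third has no usable price, and the fourth has no name.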
In web data integration, the data is not only collected: the provider also uses APIs to integrate the collected data with the client’s own data. This often includes handling ongoing additions, changes, and deletions, as well as maintaining the data on a regular basis.
Web data integration goes beyond coding a custom web scraper to collect datasets, which may be inaccurate or incomplete. Integrators help the business identify web data that will provide value to the organization, extract that data completely and accurately, and present the data in a consumable format, cleansed and prepared for analysis. Sentiment analysis, product and pricing information, and market intelligence can then be obtained within minutes.
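The ongoing additions, changes, and deletions can be pictured as a simple synchronization step. The sketch below is one possible approach, not a description of any particular vendor’s API; the `apply_changes` function, the record ids, and the prices are all made up for illustration.

```python
def apply_changes(existing, incoming):
    """Update `existing` (a dict keyed by record id) so it matches `incoming`.

    Returns three sorted lists of ids: added, changed, and deleted.
    """
    added = sorted(k for k in incoming if k not in existing)
    changed = sorted(
        k for k in incoming if k in existing and incoming[k] != existing[k]
    )
    deleted = sorted(k for k in existing if k not in incoming)

    for k in added + changed:
        existing[k] = incoming[k]
    for k in deleted:
        del existing[k]
    return added, changed, deleted

# Hypothetical client data and a fresh snapshot of collected web data.
client_data = {"sku-1": {"price": 10.0}, "sku-2": {"price": 20.0}}
latest_web_data = {"sku-2": {"price": 18.5}, "sku-3": {"price": 7.25}}

added, changed, deleted = apply_changes(client_data, latest_web_data)
print(added, changed, deleted)
print(client_data)
```

Returning the three lists alongside the update gives the business a record of what changed on each run, which helps with the regular maintenance described above.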
Large and Small
Collecting alternative data is not always a big job; sometimes it’s as simple as displaying the day’s mortgage interest rate or the number of tickets remaining for a concert. The larger the data set, the higher the probability it contains errors, and the more likely the organization will need the services of a web data integrator. This is especially true when dealing with financial data.
In some cases, integrating the data from one website is not enough. The data may need to be collected from multiple sites, which increases the potential for inaccuracy.
Poor data merging makes for unreliable data, so web data integration companies provide added services to ensure the data is more valuable to the organization. The process consumes the data and displays it in a meaningful manner, giving the business the outside intelligence it needs to make sound decisions about its operations.
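One reason merging is error-prone is that sources can disagree about the same item. As a rough sketch, the Python below merges values from several sites and sets aside any item whose values differ by more than a tolerance, so a person or a later rule can resolve it. The site names, rate labels, figures, and the `merge_sources` function are invented for illustration.

```python
from collections import defaultdict

def merge_sources(sources, tolerance=0.01):
    """Merge {source_name: {key: value}}; flag keys whose values disagree."""
    values_by_key = defaultdict(dict)
    for source_name, records in sources.items():
        for key, value in records.items():
            values_by_key[key][source_name] = value

    merged, conflicts = {}, {}
    for key, by_source in values_by_key.items():
        values = list(by_source.values())
        if max(values) - min(values) <= tolerance:
            merged[key] = values[0]      # sources agree closely enough
        else:
            conflicts[key] = by_source   # keep every value for review
    return merged, conflicts

# Hypothetical mortgage rates collected from two sites.
sources = {
    "site_a": {"30yr_fixed": 6.75, "15yr_fixed": 6.10},
    "site_b": {"30yr_fixed": 6.75, "15yr_fixed": 5.90},
}
merged, conflicts = merge_sources(sources)
print(merged)
print(conflicts)
```

Flagging disagreements instead of silently picking one value is a design choice: it trades some automation for a clearer view of where the merged data cannot yet be trusted.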
With web data integration for sales and marketing, organizations can profile their ideal customer and use this information to generate leads, create data-driven content, monitor search engine rankings, and much more.
Web data scraping and integration are important to many industries and to many functions within them, and they often influence business decisions in retail and manufacturing, risk management, equity research, travel and tourism, news, and academia. It would be easier to list the businesses that would not benefit from data sets.
Frequent website users likely encounter many sites every day that are integrating data collected from a third-party source: a travel website, a real estate website with property listings, a website that rents or sells contact lists, a financial website with a stock ticker-tape.
For businesses considering web data integration, the web is rife with examples of competitors already using web data within their organizations. These examples can be a source of ideas to guide the internal implementation of such an important business strategy.
About the author: Luke Fitzpatrick covers blockchain trends on Forbes. He has been published in Yahoo! News, Influencive and Tech In Asia. He is a guest lecturer at the University of Sydney, lecturing in Cross-Cultural Management and the Pre-MBA Program.
Edited by Erik Linask