December 22, 2024

Alternative Data

Get Started - AlternativeData.org

What is Alternative Data?
Alternative data refers to data used by investors to evaluate a company or investment that is not within their traditional data sources (financial statements, SEC filings, management presentations, press releases, etc.). Alternative data helps investors get more accurate, faster, or more granular insights and metrics into company performance than traditional data sources. Over the last 10 years, increases in computing power and personal device usage created massive growth in data generation. As a direct outcome, a large number of companies emerged to collect, clean, analyze, and interpret data and provide it as a product that could inform investment decisions (“Alternative Data Providers”). See growth in alternative data providers selling to institutional investors in Figure 1.
Alternative Data Provider Stats
Alternative Data Providers: 445
Alternative Data Use Growth
For funds to make use of these datasets for investment decisions, they have had to build out their data teams.
The number of alternative data full-time employees (FTEs) at funds has grown ~450% in the last 5 years.
Most alternative data FTEs have 11+ years of experience and do not have graduate degrees.
Tech, academia, and data providers are quickly becoming the main channels for sourcing alternative data FTEs.
The cost of an alternative data team starts at $1.5 – $2.5m.
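As a rough illustration of what the headcount growth figure implies on an annual basis, the sketch below converts a cumulative percentage increase into a compound annual rate. It assumes that "~450% growth" means headcount ended at 5.5 times its starting level and that growth compounded evenly over the 5 years. Neither assumption comes from the underlying analysis, and the function name is only illustrative.

```python
def implied_annual_growth(total_growth_pct: float, years: int) -> float:
    """Convert cumulative percentage growth into a compound annual rate."""
    # Assumption: +450% means the end value is 5.5x the starting value.
    end_multiple = 1 + total_growth_pct / 100
    return end_multiple ** (1 / years) - 1


rate = implied_annual_growth(450, 5)
print(f"Implied annual growth: {rate:.1%}")
```

Under those assumptions the implied rate is a little over 40% per year. If the figure instead meant a 4.5x end multiple, the same function called with 350 gives the corresponding rate.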
Figure 2. Growth in funds with alternative data teams and full-time alternative data employees.
See our original Buy-side Alternative Data Employee Analysis for a detailed breakdown of the growth in alternative data team building on the buy-side. Note: an updated methodology for the analysis cited above led to a new estimate of 1,190 data FTEs in 2017.
For the most recent alternative data-related job postings at funds and providers, see the Jobs Page.
As funds have found use cases and applications for the increasing number of alternative datasets, their spend on alternative data has increased accordingly. (See Figure 3 – note: this includes spend on both datasets and infrastructure).
Figure 3. Buy-side spend on alternative datasets and infrastructure.
AlternativeData Stack
After thousands of conversations with investors, vendors, and experts, we have compiled the stack of top alternative data providers in the institutional investment space. The stack focuses on the top 100 data providers used by fundamental investors. It excludes market data, economic/macro data, and market news/industry publications.
Each provider’s position is intended to reflect the firm’s product positioning relative to institutional investors. Data providers in the clusters towards the top are focused on data analysis and extracting insights from alternative data. Clusters positioned toward the bottom are more focused on data collection and quality assurance. Their data tends not to be consumed directly by fundamental analysts and PMs, but rather goes through data brokers, the sell-side, or internal data teams for analysis.
For major players among alternative data providers, broken out by data source and sector coverage, read on.
Major Types of Alternative Data
How is alternative data generated?
Individuals: Social/Sentiment, Web Traffic, App Usage, Survey
Business Processes: Credit/Debit Card, Web Data, Public Data, Email/Consumer Receipts
Sensors: Geo-location, Satellite, Weather
Data from business processes are typically more structured than data from individuals or sensors.
Data cost: typically Business Processes > Sensors > Individuals.
What are the different categories of alternative data?
App Usage – Data on app engagement and reviews. The level of data accuracy and usefulness depends on the app panel size, functions and features collected, and the level of user engagement. Popular use cases: gaming, food delivery, streaming services.
Credit/Debit Card – Transaction data generated from credit and debit cards. This data is considered highly accurate when the transaction panel is large and covers a consistent user sample. Usually panels over 3 million consumers are considered large enough to be useful. These panels are some of the more expensive data licenses on the market. Popular use cases: Retail revenue tracking.
Email/Consumer Receipts – Transaction data generated from email receipts. This data is accurate, but panels are typically smaller than credit/debit card panels and can be biased depending on the nature of the email receipt collection (often via an opt-in email or rewards app). Popular use cases: Retail revenue tracking.
Geo-location – Foot traffic data available from WiFi signals (limited granularity and accuracy) or Bluetooth beacons (higher accuracy, more expensive, less coverage). Popular use cases: Geography-specific retail foot traffic tracking.
Public Data – Data from public resources. In its original form, this data is often difficult to access, unclean, and not in a usable format (e.g., PDF). The value add of public data providers is the work of collecting, aggregating, and making the data actionable. Examples include SEC filings, patent data, government contracts, import/export data, etc. Popular use cases: patent data for tech companies; supply chain imports for manufacturing; government contracts for construction companies.
Satellite – Data collected from satellites or (increasingly common) low-level drones. This data is expensive and of variable quality. Image processing is as important as data collection (raw data is not valuable to most investment teams). Satellite data on parking lots is only useful if a more direct measurement of store activity (geo-location data) or spend (credit card, email receipt data) is unavailable or beyond the price range. Popular use cases: supply chain disruption tracking; agriculture yields tracking; construction tracking; oil & gas production/storage.
Sell-side – Alternative data teams within large sell-side institutions. Combine new data and processing techniques with traditional sell-side research.
Social/Sentiment – Data obtained from text processing of social media, news, management communications, and other sources. Sentiment data is more relevant for some companies (think younger, more trading volume, more volatile) than for large, established corporations. The data is often more relevant to shorter-term traders, as it does not always reflect fundamental business aspects. It sits on the lower end of the cost spectrum. Popular use cases: Event-driven sentiment tracking; Brand Virality/Advertising success.
Survey – Data collected from surveys. This requires opt-in, and panel diversity varies with the quality of the provider. It is a direct line into consumer sentiment, rather than an inference from text processing as in social/sentiment data. Popular use cases: brand preference; consumer behavior.
Weather – Data on weather patterns collected from sensors. Popular use cases: agriculture and commodities.
Web Data – Data scraped from public websites. This data comes in a wide range, from highly accurate and expensive to extremely raw and relatively inexpensive. It is applicable where KPIs can be tracked by aggregating and analyzing large amounts of public-facing information, such as companies that publicize quantity sold and prices on each item page. This data can be extremely granular. Popular use cases: e-commerce; auto sales; airline bookings; travel bookings; job postings.
Web Traffic – Data on quantity, demographics, and history (clickstream) of users visiting a certain website. This is popular for tracking e-commerce efforts. Popular use cases: travel bookings; e-commerce.
Other – There are many other popular datasets, including point-of-sale data, ad spend data, pricing data, and much more. These are not yet broad enough to capture a full section.
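The Credit/Debit Card entry above stresses that a transaction panel should cover a consistent user sample. The sketch below illustrates one way that idea could be applied to retail revenue tracking: year-over-year spend at a merchant is measured only over users who are active in the panel in both years. The rows, the merchant names, and the function are hypothetical and are not taken from any provider's methodology.

```python
from collections import defaultdict

# Hypothetical panel rows: (user_id, year, merchant, spend in dollars)
transactions = [
    ("u1", 2023, "ACME", 100.0),
    ("u1", 2024, "ACME", 120.0),
    ("u2", 2023, "ACME", 50.0),
    ("u2", 2024, "OTHER", 30.0),  # still in the panel, no ACME spend in 2024
    ("u3", 2024, "ACME", 300.0),  # joined the panel in 2024 only
]


def yoy_growth_consistent_panel(rows, merchant, prior_year, current_year):
    """Year-over-year merchant spend among users active in both years."""
    active = defaultdict(set)   # year -> users with any transaction that year
    spend = defaultdict(float)  # (year, user) -> spend at the merchant
    for user, year, row_merchant, amount in rows:
        active[year].add(user)
        if row_merchant == merchant:
            spend[(year, user)] += amount

    # Keep only users observed in both periods, so panel churn is not
    # mistaken for a change in the merchant's sales.
    shared = active[prior_year] & active[current_year]
    prior = sum(spend[(prior_year, user)] for user in shared)
    current = sum(spend[(current_year, user)] for user in shared)
    if prior == 0:
        raise ValueError("no prior-year spend among shared users")
    return current / prior - 1


growth = yoy_growth_consistent_panel(transactions, "ACME", 2023, 2024)
print(f"Consistent-panel growth: {growth:.1%}")
```

In this toy data the consistent-panel figure is a 20% decline, whereas summing all rows would show a large increase driven entirely by the user who joined in 2024.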
Which are the most popular datasets for investors?
Data source with the greatest number of providers: Social/Sentiment
Highest grossing data source: Credit/Debit Card
Most utilized datasets: Web Data, Credit/Debit Card
Most insightful datasets: Credit/Debit Card, Web Data
Least insightful datasets: Geo-location, Satellite
Major Players in Alternative Data
How to Collect Alternative Data - Chain of Demand

Alternative data has made its mark in many industries, serving as a source of information that helps investors make smarter business decisions. It is steadily overtaking traditional data sources as a go-to for investment choices. There are many types of alternative data, and there are also several distinct ways of collecting it.
Ways to Collect Alternative Data
Using alternative data allows traders to analyze portfolios and funds at a granular level of detail, with information from a huge range of data categories and industries. Understanding how this data is collected helps investors approach the space better informed.
Web Scraping
Web scraping is the general practice of extracting data from websites. Scrapers, or bots, download web pages and pass the relevant content through a set of text-processing functions. The extracted information can then be exported to a spreadsheet or transformed into a form that is easier to understand. For example, web scrapers can extract contacts and other details from a page. In marketing, web scraping is often used for lead generation, market analysis, price comparison, and competitive analysis.
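To make the extraction step concrete, here is a minimal sketch using only Python's standard-library HTML parser. The page markup, the class names "name" and "price", and the PriceParser class are all hypothetical. A real scraper would first download the page and would need to respect the site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical product listing; a real scraper would download this HTML first.
PAGE = """
<ul>
  <li class="item"><span class="name">Widget</span><span class="price">$19.99</span></li>
  <li class="item"><span class="name">Gadget</span><span class="price">$5.00</span></li>
</ul>
"""


class PriceParser(HTMLParser):
    """Collect (name, price) pairs from spans with the assumed class names."""

    def __init__(self):
        super().__init__()
        self._field = None  # which span we are currently inside, if any
        self._name = None   # most recent product name seen
        self.rows = []

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "price":
            # Strip the currency symbol and store the price as a number.
            self.rows.append((self._name, float(data.strip().lstrip("$"))))

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


parser = PriceParser()
parser.feed(PAGE)
print(parser.rows)
```

The resulting list of tuples is already in a tabular shape, so it could be written out with the standard csv module for use in a spreadsheet.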
Collecting Raw Data
Raw data is unstructured data in its original form that can be processed to yield greater insight. Sensor readings are one example of raw data that can be cleaned and used to gather market intelligence. Imagery is another important kind of raw data, and it requires image processing before it becomes useful. The downside of collecting raw data is the amount of time that has to be put into processing it. In unprocessed form, raw data is often not valuable to investors.
Third-Party Licensing
Some companies obtain licenses to collect exhaust data, which is data generated as a by-product of a business process, such as POS transactions or debit and credit card transaction details. Licensing rates for exhaust data vary from company to company. The data is then processed into a structured format and sold to various companies.
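The step of turning exhaust data into a structured product can be sketched as a simple roll-up. The example below assumes a hypothetical POS log format (timestamp, store id, card token, amount) and aggregates it to daily totals per store, dropping the card token along the way. Actual licensed feeds will differ in format and in what must be removed.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw POS log lines: timestamp, store id, card token, amount
raw_log = [
    "2024-03-01T09:15:00,store_7,tok_a1,12.50",
    "2024-03-01T17:40:00,store_7,tok_b2,30.00",
    "2024-03-02T11:05:00,store_7,tok_a1,8.25",
]


def daily_store_totals(lines):
    """Roll transaction lines up to (date, store) totals, dropping card tokens."""
    totals = defaultdict(lambda: {"transactions": 0, "revenue": 0.0})
    for line in lines:
        timestamp, store, _card_token, amount = line.split(",")
        day = datetime.fromisoformat(timestamp).date().isoformat()
        bucket = totals[(day, store)]
        bucket["transactions"] += 1
        bucket["revenue"] += float(amount)
    return dict(totals)


totals = daily_store_totals(raw_log)
for (day, store), stats in sorted(totals.items()):
    print(day, store, stats["transactions"], stats["revenue"])
```

Aggregating to the day and store level is one design choice among many. It keeps the output small and removes per-card detail, at the cost of losing basket-level granularity.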
Challenges in Collection of Alternative Data
Collecting alternative data for investment is still a relatively new area to explore for businesses. For this reason, there are several challenges that a manager can face when trying to collect alternative data. Below are some common challenges in the collection of alternative data.
Non-Traditional Data Sources
Logistics data that quantifies a company's shipping activity is one example of a non-traditional data source. A common problem in handling such sources is a lack of expertise. If a company does not have a team skilled at collecting data from non-traditional sources, the data may not be as useful as it could be.
Collection of High-Quality Data
Collecting high-quality, valuable alternative data is perhaps the main challenge. There are many sources of alternative data, but knowing which are useful can be the tricky part. One prevalent issue is verifying the data's authenticity. One has to be vigilant about the source and about whether the data can be legally accessed.
Unstructured Data Sources
As mentioned above, one of the issues with alternative data is that much of it comes in unstructured form. Unconventional data sources often have not been cleaned by the people who hold the data. This means that a significant investment of time and resources is needed to clean and process it.
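As a small illustration of that cleaning work, the sketch below turns free-text shipment notes into structured records with a regular expression and sets aside the lines it cannot parse. The note format, the field names, and the pattern are hypothetical. Real feeds are usually messier and need far more rules than this.

```python
import re

# Hypothetical free-text shipment notes, as they might arrive from a raw feed
notes = [
    "Shipped 1,200 units to Rotterdam on 2024-02-03",
    "  shipped 85 units to  Long Beach on 2024-02-05 ",
    "delayed - no details",
]

PATTERN = re.compile(
    r"shipped\s+(?P<units>[\d,]+)\s+units\s+to\s+(?P<port>.+?)\s+on\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2})",
    re.IGNORECASE,
)


def parse_notes(lines):
    """Split lines into structured records and rejected (unparseable) lines."""
    records, rejected = [], []
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            rejected.append(line)
            continue
        records.append({
            "units": int(match["units"].replace(",", "")),  # "1,200" -> 1200
            "port": " ".join(match["port"].split()),        # collapse spaces
            "date": match["date"],
        })
    return records, rejected


records, rejected = parse_notes(notes)
print(records)
print(rejected)
```

Keeping the rejected lines rather than silently dropping them makes it possible to measure how much of the raw feed the cleaning rules actually cover.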
High Costs of Aggregated Transactions
Financial transaction data often carries high licensing fees. Not only is collection expensive, but the data also requires a lot of computing power. Each day, 2.5 exabytes of data are generated, which requires huge storage servers, processing capacity, computing power, and analytical resources. And this is not even a fixed amount.
Privacy Concerns
Alternative data can raise data privacy concerns. There are various things to account for when collecting it. As a business, you need to understand whether or not you are breaching privacy during the collection phase. The attributes of alternative data change over time and can therefore be tricky to navigate. If a company is careful about privacy when dealing with alternative data, it will ultimately bode well for the organization.
Final Thoughts
Overall, the collection of alternative data, if handled well, can be of tremendous use for a business. However, it’s important to be realistic with your expectations and be wary of all the challenges of collecting alternative data. Not only can it be time-consuming and resource-intensive, but it is also a sensitive space to explore.
For this reason, companies rely on data-analytics firms such as Chain of Demand to provide help and ultimately make data easy to use. With hundreds of millions of data sources processed, Chain of Demand covers a wide range of data. It can help retail investors and hedge fund managers make the right business decisions.

Frequently Asked Questions about alternative data

What is alternative data?

Alternative data refers to data used by investors to evaluate a company or investment that is not within their traditional data sources (financial statements, SEC filings, management presentations, press releases, etc.).

How do you collect alternative data?

Ways to collect alternative data include web scraping, collecting raw data, and third-party licensing. Challenges in collection include non-traditional data sources, collection of high-quality data, unstructured data sources, high costs of aggregated transactions, and privacy concerns. (May 26, 2021)

What is alternative market data?

More than 400 companies are engaged in selling alternative data to hedge funds, thereby contributing significantly to market revenue. Alt-data refers to undiscovered data that is not within the traditional data sources, such as SEC filings, financial statements, press releases, and management presentations.
