Tutorial
November 20, 2021

How to Build a Heatmap of the Stock Market – Part One: Portfolios, ETFs, and Prices

Learn how to acquire constituents and prices for a portfolio of stocks using Python, then use that data to build a heatmap visualization.

Cloud icon

Taro Kuriyama

https://iexcloud.io/community/blog/how-to-build-a-heatmap-of-the-stock-market-part-one-portfolios-etfs-and-prices
@iexcloud
https://iexcloud.io/community/blog/how-to-build-a-heatmap-of-the-stock-market-part-one-portfolios-etfs-and-prices
@iexcloud

Learn how to acquire constituents and prices for a portfolio of stocks using Python, then use that data to build a heatmap visualization.

Visualizing a Portfolio

The market is down 0.5%, but your portfolio is up 0.1%. Why? Which sectors and tickers drive the overall performance of the market and your portfolio?

A heatmap provides an information-dense visualization to answer such questions quickly. This technique uses color coding to show how individual components in a group perform relative to each other. In the world of investments, they’re a popular way to analyze how specific tickers perform within an index, ETF, or portfolio.

With the help of a heatmap, you can quickly identify underperformers or overperformers, all while getting a sense of how much individual components drive the performance of the overall portfolio.

A typical heatmap encodes data as follows:

Data SemanticsVisual EncodingData ExampleRelative sizeAreaPortfolio holdings by weightHierarchyPosition / Visual GroupingPortfolio sectors / subsectorsChangeColorPrice change from previous period

A heatmap that follows such an encoding scheme looks like this:

Treemap example

In this two-part series, we’ll show you how to build these visualizations and use them with ETF data. We’ll also discuss why ETF data can be a good choice for measuring market performance.

This first post will cover the fundamentals of how to acquire a portfolio of stocks with closing prices from IEX Cloud using Python. Market indices will be used as reference portfolios, with ETFs as proxies for such indices.

The second post will show how to transform the data into a heatmap using Elm, a functional programming language for the web.

Technical Notes

This article uses Python 3.9. Code is annotated with the typing standard library (and assumes the use of mypy for static analysis, though it is not required).

Standard Dependencies

  • pandas
  • requests

All standard dependencies can be installed with a package manager of choice, e.g. pip install requests.

Additional Dependencies

For brevity and clarity, helper functions are collected in local helper scripts utils.py, iex_cloud.py, and log_setup.py. All code, including the helper files, can be found here.

ETFs as Index Proxies

Suppose we want to use the S&P 500 index, benchmark against U.S. microcaps, or monitor the latest ESG index.

For many use cases, licensing the data from the index providers may be impractical (for instance due to costs or restrictive terms of usage).

An alternative is to use ETFs as proxies. ETFs closely track benchmark indices and, in most cases, the composition of stocks in the ETF holdings (or alternatively, the Portfolio Composition Files) are useful proxies for the actual index composition. (Some readers may wish to dig deeper into further nuances, as there can be discrepancies.)

ETF data is available from a variety of providers. For illustrative purposes, this article parses a holdings file made publicly available by the ETF issuer -- for the iShares Core S&P 500 ETF (IVV).

Fetching ETF Holdings

Since the data is in a loosely structured CSV format, some parsing is required to obtain the ETF holdings. The below function describes the successive steps:

  1. Fetch the raw text data.
  2. Convert the text data to a two-dimensional list of strings.
  3. Parse the fund reference data.
  4. Parse the fund holdings data.

The full parser is available in the previously noted additional dependencies link.

If the HTTP call and parsing succeed, the function returns a FundRef typed dict object, and a Pandas dataframe of the holdings.

Next, we pull some prices from IEX Cloud to join with the holdings data.

Light IEX Cloud API Wrapper

To illustrate usage of the IEX Cloud API in Python, we'll write some simple wrappers around the requests library. Note that IEX Cloud lists a number of official and unofficial client libraries.

Starting with the goal: we want to write simple functions like the below, which call API endpoint with the minimum options required.

Note that the api_token parameter expects a publishable key. The get function constructs a request URL string according to the API specification, calling the Production Stable environment by default and returning the response object.

The http_get() helper function wraps the standard requests.get function and adds logging for HTTP response status codes that are not 200.

Fetching Prices

One way to fetch prices is via the Previous endpoint defined earlier, which provides end-of-day data for the previous business day. For stock prices, the Quote and Intraday Prices endpoints are also available. For illustrative purposes, we parse the response into an Open, High, Low, Close, Volume (OHLCV) tuple, defined as a type alias.

The HTTP GET request may fail, and the response may also fail. In either case, log messages will be generated, and the call to get_prices() will return an empty OHLCV tuple with an error status and message.

Here's an example of a normal request and response:

Joining Data

Joining the holdings and prices data is straightforward as long as the symbology is consistent.

First, we filter the holdings to just stocks and a subset of columns:

We then consider that we need a function to add prices to holdings:

Testing the function:

Priced holding tuples are emitted, as expected. Mapping the pricing function over the entire holdings set returns a priced portfolio.

Note that the actual implementation may vary depending on the downstream processing required. For example, targeting a Pandas DataFrame output would also be common, as well as generating a flat list or tuple for database insertion.

Implementation Considerations

This post has dealt with data for a single business day, stored in memory. To develop a time series going forward, a recurring process will be required to fetch and persist the data (e.g. as flat files, or written to a database).

In implementing such a recurring process, perhaps joining some additional data from sources outside IEX Cloud, a variety of considerations arise. A few noteworthy ones include:

  • Symbology matching: Joining IEX Cloud data with other sources will require symbols to be matched on a regular basis
  • Stale data: Some data may be unexpectedly stale (e.g. a stale price, if a stock has suspended trading)
  • Job handling: What times and days are optimal for acquiring the data? What happens if jobs fail? What happens on holidays?

Wrapping Up

In this post, we've covered the basics of how to fetch ETF constituents (which can be used as proxies for indices) and join them with price data. The code, as well as a Jupyter notebook used to write this post, is available here.

In the next article, we'll see how the data can be served to an Elm application, which will generate an interactive heatmap for the web. See you there!

Part Two: Build and Interactive Heatmap

IEX Cloud Services LLC makes no promises or guarantees herein regarding results from particular products and services, and neither the information, nor any opinion expressed here, constitutes a solicitation or offer to buy or sell any securities or provide any investment advice or service. Any views or opinions expressed here in this blog do not represent the official opinions of IEX Group or IEX Cloud.