Step-by-Step Guide: Get Any US Stock Price Data with Python and Alpaca Markets
The US Stock Market is worth over $50 trillion! I’ll show you how to retrieve data for any stock so you can autotrade it like a pro.
The US Stock Market is the largest stock market in the world. Worth $50.8 trillion on 1 January 2024, it remains one of the fastest growing wealth creation mechanisms in the world.
The size and complexity of the market are enormous. There are multiple exchanges (e.g. NASDAQ, NYSE) and over 10,000 different companies you could analyze.
Analyzing a market of this size is a perfect opportunity to use automation. In no time at all (or maybe 30 minutes), I’ll show you how to retrieve the price data for any stock or combination of stocks on the market — a perfect setup for the rest of this series.
So let’s jump in.
In this episode, I’ll show you everything you need to retrieve historical market data for any US based stock. You’ll be using a combination of:
The best way to complete this episode is to have a dev environment set up and ready to go. I’ll be using the dev environment I created in a previous episode.
I’m passionate about effective learning environments, so this episode includes some wonderful learning resources you can access.
DYOR. Note that all trading is at your own risk. My goal is to provide you with the self-developed methods, systems, and tools I use — it is up to you to figure out if this solution works for you AND if I’ve provided credible content. Always DYOR (Do Your Own Research)
Referrals. I receive no commissions for any of the products I mention in this blog. They’re all free (or have a free tier), and I simply provide links to simplify your learning experience.
AI Use. No AI was harmed in the creation of this blog. Some of the images are partially generated or enhanced through AI tools, but we always use a real human to put them together. I do not use AI to generate the text, just to spell check.
Alpaca Markets advertises itself as an API-driven platform for trading the stock market. As a long-time user, I’m always impressed with its ease of use, the power of its API, and the ongoing addition of new markets.
Their documentation is pretty good, and you can trade Stocks, Crypto, and Forex.
To sign up for Alpaca Markets, follow this link.
The purpose of this series is to build a trading bot. Therefore, we want to make our code as simple as possible.
One of the best ways to do this is to separate out the various pieces of functionality required. That way, we can refer back to them as and when we need to. It also allows us to comply with a powerful development principle called DRY — Don’t Repeat Yourself.
To do this, head to your trading bot dev environment and create a file called alpaca_interactions.py. This file will be where we handle any interactions with the Alpaca Markets API.
Here’s what your file system should look like with this file added:
Authentication to Alpaca Markets is handled through the use of an API Key and an API Secret. You receive a different key pair for each Alpaca Markets account you use.
In this series, we’ll be using Paper Trading for all the examples. This ensures we don’t lose any money while we’re experimenting.
To get your keys, follow these steps:
Note. Make sure you record your API secret key immediately (and in a safe place!), as Alpaca Markets keeps no record of the secret key if you lose it.
As with all things related to secrets in coding, it’s super critical to keep your keys secure. Losing them can be an absolute pain, and if they get stolen and misused…well, I’m sure you can imagine the hassle!
Fortunately, the GitHub Codespace we’re using for our dev environment has some neat secret storage features freely available to us. They’re called GitHub Encrypted Secrets, and you can read more about them here.
For our purposes, this will allow us to store the secrets in a way that’s both really secure AND easy for you (and only you) to access.
Here’s how to add them:
At this point, your Codespace should notify you that a new secret has been added. Choose the option to reload your environment.
It’s time to get some stock data!
We’ll start by enabling your trading bot to access the API Key and API Secret Key you just created. Add the following code to your alpaca_interactions.py file:
import os
import requests
# Set the Alpaca.Markets API Key
API_KEY = os.getenv('ALPACA_API')
API_SECRET = os.getenv('ALPACA_SECRET_API')
This code imports the os library into your trading bot, then sets two variables, one for each of your keys. The requests library is used later.
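One thing to watch: if a secret is missing, os.getenv quietly returns None and the problem only shows up later, when a request is made. Here is a minimal sketch of an up-front check. The helper name missing_secrets is hypothetical (it is not part of Alpaca, GitHub, or the bot we build in this episode), and it assumes the two variable names used above.

```python
import os

# Names of the environment variables used in this episode
REQUIRED_SECRETS = ["ALPACA_API", "ALPACA_SECRET_API"]


def missing_secrets(env=None) -> list:
    """
    Hypothetical helper: return the names of any required secrets that are not set
    :param env: A mapping to check (defaults to os.environ)
    """
    if env is None:
        env = os.environ
    # Treat unset and empty values the same way
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


# Report the result without ever printing the secret values themselves
missing = missing_secrets()
if missing:
    print(f"Missing secrets: {missing}")
else:
    print("Both Alpaca secrets are available.")
```

If this reports a missing secret straight after you add it, the Codespace most likely still needs to be reloaded.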
In this trading bot, we’ll be using the raw HTTP endpoint for Alpaca Markets, rather than transiting through their Python SDK. The reason for this is that the Python SDK is almost impossibly complex and not very well documented. In contrast, the raw HTTP endpoint is much simpler to use and is well documented. Over the years, this has been a much more effective API for me.
Here’s what you need to do:
To do this, add the code below to your alpaca_interactions.py file:
# Base function for querying the Alpaca.Markets API
def query_alpaca_api(url: str, params: dict) -> dict:
    """
    Base function for querying the Alpaca.Markets API
    :param url: The URL to query
    :param params: The parameters to pass to the API
    """
    # Check that the API Key and Secret are not None
    if API_KEY is None:
        raise ValueError("The API Key is not set.")
    if API_SECRET is None:
        raise ValueError("The API Secret is not set.")
    # Set the header information
    headers = {
        'accept': 'application/json',
        'APCA-API-KEY-ID': API_KEY,
        'APCA-API-SECRET-KEY': API_SECRET
    }
    try:
        # Get the response from the API endpoint
        response = requests.get(url, headers=headers, params=params)
    except Exception as exception:
        print(f"An exception occurred when querying the URL {url} with the parameters {params}: {exception}")
        raise exception
    # Get the response code
    response_code = response.status_code
    # If the response code is 403, print that the API key and or secret are incorrect
    if response_code == 403:
        print("The API key and or secret are incorrect.")
        raise ValueError("The API key and or secret are incorrect.")
    # Convert the response to JSON
    json_response = response.json()
    # Return the JSON response
    return json_response
Finally, head to your requirements.txt and add a line called requests to the bottom.
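It can help to see what the base function actually sends. The keys travel in the request headers, while the params dictionary becomes the query string on the URL. The sketch below previews that URL using only the standard library, so it runs without your keys or a network connection. The helper name preview_request_url and the example parameter values are hypothetical, and the result should be close to, but is not guaranteed to be byte-for-byte identical with, what requests builds.

```python
from urllib.parse import urlencode

# Endpoint used later in this episode for historical bars
BARS_URL = "https://data.alpaca.markets/v2/stocks/bars"


def preview_request_url(url: str, params: dict) -> str:
    """
    Hypothetical helper: build the URL a GET request would use for these params
    :param url: The base URL to query
    :param params: The query parameters
    """
    # urlencode percent-encodes each value, so the comma between symbols becomes %2C
    return f"{url}?{urlencode(params)}"


# Example values only
example_params = {"symbols": "AAPL,MSFT", "timeframe": "1day", "limit": 5}
print(preview_request_url(BARS_URL, example_params))
# prints: https://data.alpaca.markets/v2/stocks/bars?symbols=AAPL%2CMSFT&timeframe=1day&limit=5
```

Because the keys stay in the headers, a URL like this is safe to print while debugging.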
Nice work!
Okies. Right now we can successfully retrieve historical pricing data from Alpaca Markets.
However, if you were to look at the incoming data, you’d quickly find it’s not very easy to read. Furthermore, there are a few ways we can modify our data so that it’s much easier to convert into strategies further on.
To do this, we’ll be leveraging a famous Python Library called Pandas. This library is one of the truly great Python libraries, and in truly Python style, it’s completely free.
Python is pretty amazing. It’s extremely fast, very flexible, and used throughout the data analysis world. I’m yet to find something data related that I can’t interface with.
First things first, we’ll need to import the Pandas library to our trading bot. To do this, return to your requirements.txt file and add the Pandas library.
At this point, your requirements.txt file should look like this:
requests
pandas
Now, reload your dev environment by running the command pip install -r requirements.txt in your terminal (or bash).
As I mentioned before, all interaction with Alpaca Markets is through our alpaca_interactions.py file. Therefore, we need to update this file to include the API query functionality.
To do this, start by adding these two lines to the TOP of your alpaca_interactions.py file:
import pandas
import datetime
Next, add the below code to the BOTTOM of this file:
# Function to retrieve historical candlestick data from Alpaca.Markets
def get_historic_bars(symbols: list, timeframe: str, limit: int, start_date: datetime.datetime, end_date: datetime.datetime) -> pandas.DataFrame:
    """
    Function to retrieve historical candlestick data from Alpaca.Markets
    :param symbols: The symbols to retrieve the historical data for
    :param timeframe: The timeframe to retrieve the historical data for
    :param limit: The number of bars to retrieve
    :param start_date: The start date for the historical data
    :param end_date: The end date for the historical data
    """
    # Check that the start_date and end_date are datetime objects
    if not isinstance(start_date, datetime.datetime):
        raise ValueError("The start_date must be a datetime object.")
    if not isinstance(end_date, datetime.datetime):
        raise ValueError("The end_date must be a datetime object.")
    # Check that the end date is not in the future
    if end_date > datetime.datetime.now():
        print("The end date is in the future. Setting the end date to now.")
        end_date = datetime.datetime.now()
    # Check that the start date is not after the end date
    if start_date > end_date:
        raise ValueError("The start date cannot be after the end date.")
    # Convert the symbols list to a comma-separated string
    symbols_joined = ",".join(symbols)
    # Set the start and end dates to the correct format - they should only include days
    start_date = start_date.strftime("%Y-%m-%d")
    end_date = end_date.strftime("%Y-%m-%d")
    # Create the params dictionary
    params = {
        "symbols": symbols_joined,
        "timeframe": timeframe,
        "limit": limit,
        "start": start_date,
        "end": end_date,
        "adjustment": "raw",
        "feed": "sip",
        "sort": "asc"
    }
    # Set the API endpoint
    url = "https://data.alpaca.markets/v2/stocks/bars"
    # Send to the base function to query the API
    try:
        json_response = query_alpaca_api(url, params)
    except Exception as exception:
        print(f"An exception occurred in the function get_historic_bars() with the parameters {params}: {exception}")
        raise exception
    # Extract the bars from the JSON response
    json_response = json_response["bars"]
    # Create an empty parent dataframe
    bars_df = pandas.DataFrame()
    # Iterate through the symbols list
    for symbol in symbols:
        # Extract the bars for the symbol (an empty list if the response has none for it)
        symbol_bars = json_response.get(symbol, [])
        # Convert the bars to a dataframe
        symbol_bars_df = pandas.DataFrame(symbol_bars)
        # Add the symbol column
        symbol_bars_df["symbol"] = symbol
        # Modify the following column names to be more descriptive:
        # o -> candle_open
        # h -> candle_high
        # l -> candle_low
        # c -> candle_close
        # v -> candle_volume
        # t -> candle_timestamp
        # vw -> vwap
        # Rename the columns
        symbol_bars_df = symbol_bars_df.rename(
            columns={
                "o": "candle_open",
                "h": "candle_high",
                "l": "candle_low",
                "c": "candle_close",
                "v": "candle_volume",
                "t": "candle_timestamp",
                "vw": "vwap"
            }
        )
        # Add the symbol bars to the parent dataframe
        bars_df = pandas.concat([bars_df, symbol_bars_df])
    # Return the historical bars
    return bars_df
This code is pretty powerful, and it does a lot. Here’s an overview:
Note. The data format we convert this API response into is consistent across my entire trading bot series. So even if you look at my series about Polygon, Binance, and so on, you’ll find the format is exactly the same!
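To make the reshaping step concrete, here is a small sketch you can run without calling the API. It assumes the response has the shape the function above relies on: a "bars" dictionary keyed by symbol, each holding a list of bars with the short field names. The sample values are made up for illustration and are NOT real market data, and flatten_bars is a hypothetical helper that uses plain dictionaries rather than Pandas so nothing needs installing.

```python
# Short API field names mapped to the descriptive names used in this series
COLUMN_NAMES = {
    "o": "candle_open",
    "h": "candle_high",
    "l": "candle_low",
    "c": "candle_close",
    "v": "candle_volume",
    "t": "candle_timestamp",
    "vw": "vwap",
}

# Made-up payload for illustration only: these are NOT real prices
sample_response = {
    "bars": {
        "AAPL": [
            {"t": "2024-01-02T05:00:00Z", "o": 10.0, "h": 12.0, "l": 9.0, "c": 11.0, "v": 1000, "vw": 10.5},
            {"t": "2024-01-03T05:00:00Z", "o": 11.0, "h": 13.0, "l": 10.0, "c": 12.0, "v": 1200, "vw": 11.5},
        ],
        "MSFT": [
            {"t": "2024-01-02T05:00:00Z", "o": 20.0, "h": 22.0, "l": 19.0, "c": 21.0, "v": 2000, "vw": 20.5},
        ],
    }
}


def flatten_bars(response: dict) -> list:
    """
    Hypothetical helper: turn the nested bars into one flat list of rows
    :param response: A dictionary shaped like the sample payload above
    """
    rows = []
    for symbol, bars in response["bars"].items():
        for bar in bars:
            # Rename known fields and keep any unknown field under its original name
            row = {COLUMN_NAMES.get(key, key): value for key, value in bar.items()}
            # Tag each row with the symbol it belongs to
            row["symbol"] = symbol
            rows.append(row)
    return rows


rows = flatten_bars(sample_response)
for row in rows:
    print(row["symbol"], row["candle_timestamp"], row["candle_close"])
```

Each row carries the same fields that get_historic_bars puts into its DataFrame columns, which is why every symbol ends up in one table that can be filtered on the symbol column.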
Our trading bot is now ready to retrieve historic candlestick data from Alpaca Markets!
I’ll show you how.
Depending on whether you’ve completed some of my other content, you may already have an app.py set up in your dev environment.
If you don’t, go ahead and create that now.
Add the following code to app.py:
import alpaca_interactions as alpaca
import datetime
# List of symbols
symbols = ["AAPL"]
max_number_of_candles = 100
timeframe = "1day"
# Function to run the trading bot
def auto_run_trading_bot():
    """
    Function to run the trading bot
    """
    # Print Welcome to your very own trading bot
    print("Welcome to your very own trading bot")
    # Set the end date to yesterday
    end_date = datetime.datetime.now() - datetime.timedelta(days=1) # Note that if you have a premium subscription you can remove this restriction
    # Set the start date to one year ago
    start_date = end_date - datetime.timedelta(days=365)
    # Get the historical data
    for symbol in symbols:
        # Convert symbol to a list
        symbol = [symbol]
        # Get the historical data
        historical_data = alpaca.get_historic_bars(
            symbols=symbol,
            timeframe=timeframe,
            start_date=start_date,
            end_date=end_date,
            limit=max_number_of_candles
        )
        # Print the historical data
        print(historical_data) # <- this can be removed if you don't want to see the progression
# Main function for program
if __name__ == "__main__":
    auto_run_trading_bot()
Here, we’ve performed the following steps:
Imported the alpaca_interactions.py library that we’ve been building throughout this episode
If you go to your terminal and run app.py, you should see some results.
Here’s what I got (note your actual AAPL data will be different as you’re running it on a different day than I am):
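If you’re curious which dates actually get requested, the date handling in app.py and get_historic_bars can be sketched on its own. The helper name build_date_window is hypothetical, and a fixed "now" is passed in so the result is repeatable. The bot itself uses datetime.datetime.now().

```python
import datetime


def build_date_window(now: datetime.datetime, lookback_days: int = 365) -> tuple:
    """
    Hypothetical helper: reproduce the start and end dates sent to the API
    :param now: The moment to treat as the current time
    :param lookback_days: How many days before the end date the window starts
    """
    # The end date is yesterday, matching app.py
    end_date = now - datetime.timedelta(days=1)
    # The start date is a fixed number of days before the end date
    start_date = end_date - datetime.timedelta(days=lookback_days)
    # Only the day is kept, matching the strftime format in get_historic_bars
    return start_date.strftime("%Y-%m-%d"), end_date.strftime("%Y-%m-%d")


start, end = build_date_window(datetime.datetime(2023, 6, 15, 9, 30))
print(start, end)  # prints: 2022-06-14 2023-06-14
```

Note that the time of day is dropped before the request is made, so the 9:30 in the example has no effect on the window.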
To demonstrate the power of an algorithmic trading bot, let’s update some of our input parameters.
Symbols. For instance, let’s say you wanted to retrieve the daily data for the Facebook (now known as META), Apple, Amazon, Netflix, and Google (now known as Alphabet) tickers. This is known as the original FAANG group of companies.
To do this, all you need to do is alter your symbols variable to look like this:
symbols = ["AAPL", "GOOGL", "META", "NFLX", "AMZN"]
Timeframe. To retrieve different timeframes, update your timeframe variable. For instance:
timeframe = '30min'
More candlesticks. Update your max_number_of_candles variable. For instance:
max_number_of_candles = 1000
With a minimum amount of work, you can drastically increase the power of your trading bot!
You’ve got everything you need to retrieve data for any stock on the US Stock Markets. This sets the foundation for building an incredibly powerful stock market trading bot.
Follow my blog to see the various types of trading bots you can design and build, such as:
I love hearing from my readers, so feel free to reach out. It means a ton to me when you clap for my articles or drop a friendly comment — it helps me know that my content is helping.
❤