Issue 2: Consider Caching Binance API Responses #2

@PrinceSajjadHussain

Description

Title: Implement caching for Binance API calls to reduce load and latency

Body:
The fetch_binance_ohlcv function in backtesting_tool.py calls the Binance API directly every time the script runs. This is inefficient when the same data is requested repeatedly, and it risks hitting the rate limits imposed by the Binance API.

To improve performance and robustness, consider implementing a caching mechanism. This could involve:

  1. In-Memory Caching: Using a dictionary to store API responses based on the request parameters (symbol, interval, limit). This is suitable for shorter-lived caches.
  2. File-Based Caching: Saving API responses to files (e.g., CSV or Parquet) on disk. This allows the cache to persist across script runs.
  3. Database Caching: Using a database (e.g., SQLite, PostgreSQL) to store API responses for more robust caching and potentially sharing the cache between different applications.
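For option 1, a minimal sketch using a plain dict keyed on the request parameters (the wrapper and names below are illustrative, not part of backtesting_tool.py; `fetch_fn` stands in for the real network call):

```python
# Sketch of option 1: in-memory caching keyed on (symbol, interval, limit).
# All names here are illustrative, not part of the existing codebase.
_ohlcv_cache = {}

def fetch_ohlcv_cached(fetch_fn, symbol, interval='1d', limit=365):
    """Return cached data for this request, calling fetch_fn only on a miss."""
    key = (symbol.upper(), interval, limit)
    if key not in _ohlcv_cache:
        _ohlcv_cache[key] = fetch_fn(symbol, interval, limit)
    return _ohlcv_cache[key]
```

The cache lives only for the lifetime of the process; `functools.lru_cache` would achieve the same effect with a bounded size, provided all arguments are hashable.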

A simple file-based implementation (option 2, using Parquet, which requires pyarrow or fastparquet) could look like this:

import requests
import pandas as pd
import os

CACHE_DIR = 'binance_cache'
os.makedirs(CACHE_DIR, exist_ok=True)

def fetch_binance_ohlcv(symbol, interval='1d', limit=365):
    cache_file = os.path.join(CACHE_DIR, f'{symbol}_{interval}_{limit}.parquet')
    if os.path.exists(cache_file):
        print(f"Loading from cache: {cache_file}")
        return pd.read_parquet(cache_file)

    base_url = "https://api.binance.us/api/v3/klines"
    params = {"symbol": symbol.upper(), "interval": interval, "limit": limit}
    response = requests.get(base_url, params=params)
    response.raise_for_status()
    data = response.json()
    df = pd.DataFrame(data, columns=[
        "timestamp", "open", "high", "low", "close", "volume",
        "close_time", "quote_asset_volume", "trades",
        "taker_buy_base", "taker_buy_quote", "ignore"
    ])
    df["timestamp"] = pd.to_datetime(df["timestamp"], unit='ms')
    df.set_index("timestamp", inplace=True)
    for col in ["open", "high", "low", "close", "volume"]:
        df[col] = df[col].astype(float)
    df = df[["close"]]
    print(f"Saving to cache: {cache_file}")
    df.to_parquet(cache_file)  # requires pyarrow or fastparquet
    return df
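
Option 3 could similarly be sketched with SQLite from the standard library, adding a TTL so stale candles are refetched. The table, function names, and TTL value below are illustrative assumptions, not part of the existing code:

```python
import json
import sqlite3
import time

# Sketch of option 3: a SQLite-backed response cache with expiry.
def open_cache(path='binance_cache.db'):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS klines_cache ("
        "  key TEXT PRIMARY KEY,"
        "  fetched_at REAL,"
        "  payload TEXT)"
    )
    return conn

def get_or_fetch(conn, fetch_fn, symbol, interval='1d', limit=365, ttl=3600):
    """Return the cached payload if younger than ttl seconds, else fetch and store."""
    key = f"{symbol.upper()}:{interval}:{limit}"
    row = conn.execute(
        "SELECT fetched_at, payload FROM klines_cache WHERE key = ?", (key,)
    ).fetchone()
    if row and time.time() - row[0] < ttl:
        return json.loads(row[1])
    data = fetch_fn(symbol, interval, limit)  # e.g. the raw klines list
    conn.execute(
        "INSERT OR REPLACE INTO klines_cache VALUES (?, ?, ?)",
        (key, time.time(), json.dumps(data)),
    )
    conn.commit()
    return data
```

Because the payload is stored as JSON text, this works for the raw klines response; the DataFrame conversion would then happen after the cache lookup. A shared SQLite file also lets multiple scripts reuse the same cache.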
