Home

Posts

Tracking DeFi Portfolio History with CSVs

May 28, 2026

#portfolio tracking

Why I started with boring exports

Portfolio tracking sounds simple until you try to reconstruct what happened after a few weeks of swaps, bridges, LP deposits, airdrops, approvals, and lending transactions. Wallet balances show the current state. They do not explain how you got there.

For this step I wanted a local history I could inspect without handing every wallet to another hosted dashboard. I used rotki as the main open-source tracker, explorer exports as the raw evidence, and a small CSV ledger to catch the parts that automated tools mislabel.

The tool stack I used

I kept the setup intentionally plain:

Layer	Tool	Why it is there
Portfolio tracker	rotki	Open-source, self-hosted, local-first portfolio and accounting app
Raw chain history	Etherscan-style explorer/API exports	Verifiable transaction and ERC-20 transfer history
Manual ledger	CSV	Portable fallback when a protocol interaction is hard to classify
Price checks	CoinGecko historical API	Spot checks for USD marks on days where the tracker looked wrong

rotki is the main piece because it is built for crypto portfolio tracking and accounting, and its code is public under AGPLv3. rotki GitHub repo Its docs describe it as an open-source asset management and accounting application that tracks blockchain and exchange activity. rotki docs

That does not mean I trust it blindly. Portfolio trackers infer intent from messy data. A token transfer might be a swap output, bridge receipt, staking reward, spam token, reimbursement, or internal wallet move. The tool gets me 80-90% of the way there. CSVs are how I keep the last mile auditable.

The data model that did not break

The mistake I made first was trying to track “positions” directly. Positions change shape. A lending deposit becomes an aToken. A Uniswap V3 LP becomes an NFT. A bridge burns one representation and mints another somewhere else.

The better unit is an event.

timestamp,chain,tx_hash,event_type,asset,amount,usd_value,fee_asset,fee_amount,notes

For a small wallet, that schema was enough. I used these event types:

Event type	Meaning
`deposit`	Funds entered a protocol
`withdraw`	Funds left a protocol
`swap_in`	Asset received in a swap
`swap_out`	Asset spent in a swap
`bridge_out`	Asset left the source chain
`bridge_in`	Asset arrived on the destination chain
`reward`	Claim, airdrop, incentive, or interest
`fee`	Gas, bridge fee, protocol fee, or service fee
`transfer`	Wallet-to-wallet movement I control
`ignore`	Spam or noise I do not want in performance math

The important part is not the exact labels. The important part is that every row has a transaction hash, chain, asset, amount, and explanation. If I cannot explain a row later, it does not belong in performance reporting yet.

Importing the easy part into rotki

I added the wallet addresses first. rotki supports importing and exporting blockchain accounts as CSV, and its docs list EVM chains including Ethereum, Optimism, Polygon PoS, Arbitrum One, Base, Gnosis, Scroll, BNB Smart Chain, Avalanche, and zkSync Lite. rotki account tracking docs

For this experiment I tracked:

one main EVM wallet
one old test wallet
Arbitrum and Base activity
no centralized exchange API keys

That last point matters. I did not want read-only exchange keys in the first pass. If you trade heavily on CEXs, you probably need them eventually. For this learning step, wallet history plus manual CSVs was enough.

The rotki CSV import path is useful when a source is not directly integrated. Its docs support CSV imports for exchange-style data and custom date formats. rotki CSV import docs I treated that as the escape hatch: if rotki did not classify something cleanly, I normalized the source data into my own ledger instead of forcing it through a bad category.

Pulling raw ERC-20 transfers

For raw transfer history, I prefer API exports over clicking around in explorer UIs. The Etherscan V2 API has an account endpoint for ERC-20 token transfers by address, with parameters for chainid, address, contractaddress, startblock, endblock, pagination, and sort order. Etherscan token transfer endpoint

Here is the small script I used to turn API rows into a ledger-like CSV. It does not decode swaps. It only normalizes transfers so I can review them.

import csv
import os
from datetime import datetime, timezone
from decimal import Decimal

import requests


API_KEY = os.environ["ETHERSCAN_API_KEY"]
WALLET = "0xYourWalletAddress"
CHAIN_ID = "8453"  # Base. Use "1" for Ethereum, "42161" for Arbitrum.


def fetch_token_transfers(address: str) -> list[dict]:
    """Fetch ERC-20 transfers for one wallet from the Etherscan V2 API."""
    rows: list[dict] = []
    page = 1

    while True:
        response = requests.get(
            "https://api.etherscan.io/v2/api",
            params={
                "chainid": CHAIN_ID,
                "module": "account",
                "action": "tokentx",
                "address": address,
                "startblock": 0,
                "endblock": 999999999,
                "page": page,
                "offset": 1000,
                "sort": "asc",
                "apikey": API_KEY,
            },
            timeout=30,
        )
        response.raise_for_status()
        payload = response.json()

        if payload["status"] == "0" and payload["message"] == "No transactions found":
            break
        if payload["status"] != "1":
            raise RuntimeError(payload)

        batch = payload["result"]
        rows.extend(batch)

        if len(batch) < 1000:
            break
        page += 1

    return rows


def normalize_transfer(row: dict, wallet: str) -> dict:
    """Convert one explorer row into a portable ledger row."""
    decimals = int(row["tokenDecimal"])
    amount = Decimal(row["value"]) / (Decimal(10) ** decimals)
    direction = "in" if row["to"].lower() == wallet.lower() else "out"
    signed_amount = amount if direction == "in" else -amount
    timestamp = datetime.fromtimestamp(int(row["timeStamp"]), tz=timezone.utc)

    return {
        "timestamp": timestamp.isoformat(),
        "chain": CHAIN_ID,
        "tx_hash": row["hash"],
        "event_type": "transfer",
        "asset": row["tokenSymbol"],
        "amount": str(signed_amount),
        "usd_value": "",
        "fee_asset": "",
        "fee_amount": "",
        "notes": f"{direction} via {row['contractAddress']}",
    }


if __name__ == "__main__":
    transfers = fetch_token_transfers(WALLET)

    with open("portfolio-ledger.csv", "w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=[
                "timestamp",
                "chain",
                "tx_hash",
                "event_type",
                "asset",
                "amount",
                "usd_value",
                "fee_asset",
                "fee_amount",
                "notes",
            ],
        )
        writer.writeheader()
        for transfer in transfers:
            writer.writerow(normalize_transfer(transfer, WALLET))

The first useful output was not a chart. It was a list of rows where the same transaction hash appeared multiple times. Those were swaps, bridge settlements, LP deposits, and protocol interactions. I marked those for manual review instead of pretending that each transfer was an independent trade.

Reconciling tracker balances against the ledger

After importing the wallet into rotki, I compared three views:

wallet balance shown in the wallet
rotki balance by asset
CSV ledger sum by asset

The check is deliberately simple:

import csv
from collections import defaultdict
from decimal import Decimal


def ledger_balances(path: str) -> dict[str, Decimal]:
    """Sum non-ignored ledger rows by asset."""
    balances: dict[str, Decimal] = defaultdict(Decimal)

    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            if row["event_type"] == "ignore":
                continue
            if not row["amount"]:
                continue
            balances[row["asset"]] += Decimal(row["amount"])

    return dict(sorted(balances.items()))


if __name__ == "__main__":
    for asset, amount in ledger_balances("portfolio-ledger.csv").items():
        print(f"{asset}: {amount.normalize()}")

This caught the obvious issues:

wrapped and unwrapped assets counted separately
bridge-out rows with no matching bridge-in row yet
spam tokens appearing as assets
approvals showing up in explorer history but not as balance-changing events
LP position value missing because the position is represented by an NFT, not a normal ERC-20 balance

None of this is surprising. It is exactly why I want the raw CSV nearby. The tracker is good at current balances. The ledger is better at explaining how a balance changed.

Adding USD marks without overbuilding it

For taxes or precise PnL, price marking gets complicated fast. For this step I only needed approximate USD marks so the history chart made sense.

CoinGecko documents historical market chart endpoints for price, market cap, and volume data. CoinGecko historical data docs I used that for spot checks, not as a full pricing engine.

The manual rule was:

stablecoins: mark at $1 unless the transaction happened during a known depeg window
ETH/WETH: use daily historical ETH/USD close
tiny spam tokens: mark as blank and ignore
LP NFTs: value manually from the protocol UI and record the source in notes

That is not institutional accounting. It is good enough for a sub-$1k learning wallet where the goal is to understand flows, not file a complex tax lot report.

What the first pass showed

The wallet looked clean in the UI. The CSV was messier.

Category	Count	Notes
Normal transfers	18	Funding, wallet moves, small repayments
Swaps	9	Each produced multiple token transfer rows
Bridge events	4	Two source rows, two destination rows
Protocol deposits/withdrawals	5	Aave and LP actions needed manual labels
Approvals	11	Not balance-changing, but still useful operational history
Spam/noise	7	Marked `ignore`

The main thing I learned: portfolio history is not the same as transaction history. Transaction history is raw evidence. Portfolio history is an accounting view built on top of that evidence.

Risk analysis

Technical risk: incomplete indexing

Explorers and portfolio tools can miss decoded context even when they show the transaction. An ERC-20 transfer export does not automatically tell you whether the transfer came from a swap, bridge, LP mint, reward claim, or internal contract movement. Mitigation: keep transaction hashes in every ledger row and review multi-transfer hashes manually.

Technical risk: chain coverage gaps

rotki supports many chains, but no tracker supports every protocol interaction perfectly. New vaults, bridge wrappers, perps positions, and LP NFTs are where accounting usually gets rough. Mitigation: track unsupported positions as manual events until the tool catches up.

Economic risk: wrong price marks

Historical USD values can be wrong if you use daily prices for intraday trades, assume stablecoins were always worth $1, or ignore thin-liquidity tokens. For a small wallet this may only move the chart by a few dollars. For taxes or larger balances, it can materially distort realized gains. Mitigation: keep raw token amounts separate from USD marks so you can reprice later.

Operational risk: CSV drift

Manual ledgers rot when you update them only after the fact. The longer the gap, the more likely you forget why a transfer happened. Mitigation: export weekly, label unknown rows immediately, and keep one notes field written in plain English.

Privacy risk: hosted dashboards

Portfolio dashboards are convenient, but connecting every wallet gives the service a detailed behavioral map. Even read-only access can reveal strategy, balances, counterparties, and timing. Mitigation: use local-first tools where possible, avoid unnecessary API keys, and keep doxxed and private wallets separated.

Practical takeaways

rotki is a good default starting point if you want an open-source tracker and local storage. It gives structure without forcing you into a hosted dashboard.

CSV exports are still necessary. They are the audit trail when the tracker misclassifies a bridge, misses an LP NFT, or shows a balance without the story behind it.

Do not start by building a full accounting system. Start with one portable ledger, one row per event, and transaction hashes everywhere.

For a lean wallet, approximate USD marks are fine as long as raw token amounts stay intact. You can always reprice later. You cannot recreate missing context if you never wrote it down.

The next thing I want is a tiny script that reads balances and allowances directly from the chain. Portfolio history tells me what happened. Allowance history tells me what can still happen if I forget to clean up approvals.