Tracking DeFi Portfolio History with CSVs
Why I started with boring exports
Portfolio tracking sounds simple until you try to reconstruct what happened after a few weeks of swaps, bridges, LP deposits, airdrops, approvals, and lending transactions. Wallet balances show the current state. They do not explain how you got there.
For this step I wanted a local history I could inspect without handing every wallet to another hosted dashboard. I used rotki as the main open-source tracker, explorer exports as the raw evidence, and a small CSV ledger to catch the parts that automated tools mislabel.
The tool stack I used
I kept the setup intentionally plain:
| Layer | Tool | Why it is there |
|---|---|---|
| Portfolio tracker | rotki | Open-source, self-hosted, local-first portfolio and accounting app |
| Raw chain history | Etherscan-style explorer/API exports | Verifiable transaction and ERC-20 transfer history |
| Manual ledger | CSV | Portable fallback when a protocol interaction is hard to classify |
| Price checks | CoinGecko historical API | Spot checks for USD marks on days where the tracker looked wrong |
rotki is the main piece because it is built for crypto portfolio tracking and accounting, and its code is public under AGPLv3. rotki GitHub repo Its docs describe it as an open-source asset management and accounting application that tracks blockchain and exchange activity. rotki docs
That does not mean I trust it blindly. Portfolio trackers infer intent from messy data. A token transfer might be a swap output, bridge receipt, staking reward, spam token, reimbursement, or internal wallet move. The tool gets me 80-90% of the way there. CSVs are how I keep the last mile auditable.
The data model that did not break
The mistake I made first was trying to track “positions” directly. Positions change shape. A lending deposit becomes an aToken. A Uniswap V3 LP becomes an NFT. A bridge burns one representation and mints another somewhere else.
The better unit is an event.
timestamp,chain,tx_hash,event_type,asset,amount,usd_value,fee_asset,fee_amount,notes
For a small wallet, that schema was enough. I used these event types:
| Event type | Meaning |
|---|---|
deposit | Funds entered a protocol |
withdraw | Funds left a protocol |
swap_in | Asset received in a swap |
swap_out | Asset spent in a swap |
bridge_out | Asset left the source chain |
bridge_in | Asset arrived on the destination chain |
reward | Claim, airdrop, incentive, or interest |
fee | Gas, bridge fee, protocol fee, or service fee |
transfer | Wallet-to-wallet movement I control |
ignore | Spam or noise I do not want in performance math |
The important part is not the exact labels. The important part is that every row has a transaction hash, chain, asset, amount, and explanation. If I cannot explain a row later, it does not belong in performance reporting yet.
Importing the easy part into rotki
I added the wallet addresses first. rotki supports importing and exporting blockchain accounts as CSV, and its docs list EVM chains including Ethereum, Optimism, Polygon PoS, Arbitrum One, Base, Gnosis, Scroll, BNB Smart Chain, Avalanche, and zkSync Lite. rotki account tracking docs
For this experiment I tracked:
- one main EVM wallet
- one old test wallet
- Arbitrum and Base activity
- no centralized exchange API keys
That last point matters. I did not want read-only exchange keys in the first pass. If you trade heavily on CEXs, you probably need them eventually. For this learning step, wallet history plus manual CSVs was enough.
The rotki CSV import path is useful when a source is not directly integrated. Its docs support CSV imports for exchange-style data and custom date formats. rotki CSV import docs I treated that as the escape hatch: if rotki did not classify something cleanly, I normalized the source data into my own ledger instead of forcing it through a bad category.
Pulling raw ERC-20 transfers
For raw transfer history, I prefer API exports over clicking around in explorer UIs. The Etherscan V2 API has an account endpoint for ERC-20 token transfers by address, with parameters for chainid, address, contractaddress, startblock, endblock, pagination, and sort order. Etherscan token transfer endpoint
Here is the small script I used to turn API rows into a ledger-like CSV. It does not decode swaps. It only normalizes transfers so I can review them.
import csv
import os
from datetime import datetime, timezone
from decimal import Decimal
import requests
API_KEY = os.environ["ETHERSCAN_API_KEY"]
WALLET = "0xYourWalletAddress"
CHAIN_ID = "8453" # Base. Use "1" for Ethereum, "42161" for Arbitrum.
def fetch_token_transfers(address: str) -> list[dict]:
"""Fetch ERC-20 transfers for one wallet from the Etherscan V2 API."""
rows: list[dict] = []
page = 1
while True:
response = requests.get(
"https://api.etherscan.io/v2/api",
params={
"chainid": CHAIN_ID,
"module": "account",
"action": "tokentx",
"address": address,
"startblock": 0,
"endblock": 999999999,
"page": page,
"offset": 1000,
"sort": "asc",
"apikey": API_KEY,
},
timeout=30,
)
response.raise_for_status()
payload = response.json()
if payload["status"] == "0" and payload["message"] == "No transactions found":
break
if payload["status"] != "1":
raise RuntimeError(payload)
batch = payload["result"]
rows.extend(batch)
if len(batch) < 1000:
break
page += 1
return rows
def normalize_transfer(row: dict, wallet: str) -> dict:
"""Convert one explorer row into a portable ledger row."""
decimals = int(row["tokenDecimal"])
amount = Decimal(row["value"]) / (Decimal(10) ** decimals)
direction = "in" if row["to"].lower() == wallet.lower() else "out"
signed_amount = amount if direction == "in" else -amount
timestamp = datetime.fromtimestamp(int(row["timeStamp"]), tz=timezone.utc)
return {
"timestamp": timestamp.isoformat(),
"chain": CHAIN_ID,
"tx_hash": row["hash"],
"event_type": "transfer",
"asset": row["tokenSymbol"],
"amount": str(signed_amount),
"usd_value": "",
"fee_asset": "",
"fee_amount": "",
"notes": f"{direction} via {row['contractAddress']}",
}
if __name__ == "__main__":
transfers = fetch_token_transfers(WALLET)
with open("portfolio-ledger.csv", "w", newline="") as handle:
writer = csv.DictWriter(
handle,
fieldnames=[
"timestamp",
"chain",
"tx_hash",
"event_type",
"asset",
"amount",
"usd_value",
"fee_asset",
"fee_amount",
"notes",
],
)
writer.writeheader()
for transfer in transfers:
writer.writerow(normalize_transfer(transfer, WALLET))
The first useful output was not a chart. It was a list of rows where the same transaction hash appeared multiple times. Those were swaps, bridge settlements, LP deposits, and protocol interactions. I marked those for manual review instead of pretending that each transfer was an independent trade.
Reconciling tracker balances against the ledger
After importing the wallet into rotki, I compared three views:
- wallet balance shown in the wallet
- rotki balance by asset
- CSV ledger sum by asset
The check is deliberately simple:
import csv
from collections import defaultdict
from decimal import Decimal
def ledger_balances(path: str) -> dict[str, Decimal]:
"""Sum non-ignored ledger rows by asset."""
balances: dict[str, Decimal] = defaultdict(Decimal)
with open(path, newline="") as handle:
for row in csv.DictReader(handle):
if row["event_type"] == "ignore":
continue
if not row["amount"]:
continue
balances[row["asset"]] += Decimal(row["amount"])
return dict(sorted(balances.items()))
if __name__ == "__main__":
for asset, amount in ledger_balances("portfolio-ledger.csv").items():
print(f"{asset}: {amount.normalize()}")
This caught the obvious issues:
- wrapped and unwrapped assets counted separately
- bridge-out rows with no matching bridge-in row yet
- spam tokens appearing as assets
- approvals showing up in explorer history but not as balance-changing events
- LP position value missing because the position is represented by an NFT, not a normal ERC-20 balance
None of this is surprising. It is exactly why I want the raw CSV nearby. The tracker is good at current balances. The ledger is better at explaining how a balance changed.
Adding USD marks without overbuilding it
For taxes or precise PnL, price marking gets complicated fast. For this step I only needed approximate USD marks so the history chart made sense.
CoinGecko documents historical market chart endpoints for price, market cap, and volume data. CoinGecko historical data docs I used that for spot checks, not as a full pricing engine.
The manual rule was:
- stablecoins: mark at $1 unless the transaction happened during a known depeg window
- ETH/WETH: use daily historical ETH/USD close
- tiny spam tokens: mark as blank and
ignore - LP NFTs: value manually from the protocol UI and record the source in
notes
That is not institutional accounting. It is good enough for a sub-$1k learning wallet where the goal is to understand flows, not file a complex tax lot report.
What the first pass showed
The wallet looked clean in the UI. The CSV was messier.
| Category | Count | Notes |
|---|---|---|
| Normal transfers | 18 | Funding, wallet moves, small repayments |
| Swaps | 9 | Each produced multiple token transfer rows |
| Bridge events | 4 | Two source rows, two destination rows |
| Protocol deposits/withdrawals | 5 | Aave and LP actions needed manual labels |
| Approvals | 11 | Not balance-changing, but still useful operational history |
| Spam/noise | 7 | Marked ignore |
The main thing I learned: portfolio history is not the same as transaction history. Transaction history is raw evidence. Portfolio history is an accounting view built on top of that evidence.
Risk analysis
Technical risk: incomplete indexing
Explorers and portfolio tools can miss decoded context even when they show the transaction. An ERC-20 transfer export does not automatically tell you whether the transfer came from a swap, bridge, LP mint, reward claim, or internal contract movement. Mitigation: keep transaction hashes in every ledger row and review multi-transfer hashes manually.
Technical risk: chain coverage gaps
rotki supports many chains, but no tracker supports every protocol interaction perfectly. New vaults, bridge wrappers, perps positions, and LP NFTs are where accounting usually gets rough. Mitigation: track unsupported positions as manual events until the tool catches up.
Economic risk: wrong price marks
Historical USD values can be wrong if you use daily prices for intraday trades, assume stablecoins were always worth $1, or ignore thin-liquidity tokens. For a small wallet this may only move the chart by a few dollars. For taxes or larger balances, it can materially distort realized gains. Mitigation: keep raw token amounts separate from USD marks so you can reprice later.
Operational risk: CSV drift
Manual ledgers rot when you update them only after the fact. The longer the gap, the more likely you forget why a transfer happened. Mitigation: export weekly, label unknown rows immediately, and keep one notes field written in plain English.
Privacy risk: hosted dashboards
Portfolio dashboards are convenient, but connecting every wallet gives the service a detailed behavioral map. Even read-only access can reveal strategy, balances, counterparties, and timing. Mitigation: use local-first tools where possible, avoid unnecessary API keys, and keep doxxed and private wallets separated.
Practical takeaways
rotki is a good default starting point if you want an open-source tracker and local storage. It gives structure without forcing you into a hosted dashboard.
CSV exports are still necessary. They are the audit trail when the tracker misclassifies a bridge, misses an LP NFT, or shows a balance without the story behind it.
Do not start by building a full accounting system. Start with one portable ledger, one row per event, and transaction hashes everywhere.
For a lean wallet, approximate USD marks are fine as long as raw token amounts stay intact. You can always reprice later. You cannot recreate missing context if you never wrote it down.
The next thing I want is a tiny script that reads balances and allowances directly from the chain. Portfolio history tells me what happened. Allowance history tells me what can still happen if I forget to clean up approvals.