Contributor guide + Compound liqs boilerplate

2021-08-02 05:16:41 -07:00 · 2021-08-02 05:16:41 -07:00 · c2660e4b1e
commit c2660e4b1e
parent 1a5524e6f3
10 changed files with 414 additions and 18 deletions
--- a/GUIDE.md
+++ b/GUIDE.md
@ -0,0 +1,267 @@
+# Contributor guide
+
+### Requirements
+
+* [Install](https://docs.docker.com/compose/install/) docker compose
+    * To run `mev-inspect`, `postgres`, and `pgadmin` within a local container.
+* Python
+    * Our pre-commit hook requires v3.9. Use pyenv to manage versions and venv to create the virtual environment; instructions are [here](https://www.andreagrandi.it/2020/10/10/install-python-with-pyenv-create-virtual-environment-with-specific-python-version/).
+    * Verify with `pre-commit install && pre-commit run --all-files` 
+* Archive node with `trace_*` rpc module (Erigon/OpenEthereum)
+    * If you do not have access to an archive node, reach out to us on our [discord](https://discord.gg/5NB53YEGVM) for raw traces (of the blocks with MEV you're writing inspectors for) or an rpc endpoint.
+
+### Quick start
+
+We use poetry for Python package management. Start by installing the required libraries:
+* `poetry install`
+
+To build containers: 
+* `poetry run build`
+
+Run as daemon:
+* `poetry run start -d`
+
+Run inspect on a block:
+* `poetry run inspect --block-number 11931270 --rpc 'http://111.11.11.111:8545/'`
+
+Conversely, to stop: 
+* `poetry run stop`
+
+You will be able to run all the inspectors against a specific transaction, block, or range of blocks once we finalize our data model/architecture. For now, write a protocol-specific inspector script and verify it against a test block (one containing the MEV you're trying to quantify).
+
+Full list of poetry commands for this repo can be found [here](https://github.com/flashbots/mev-inspect-py#poetry-scripts). 
+
+
+### Tracing
+
+While simple ETH and token transfers are trivial to parse/filter (by processing their transaction input data, events and/or receipts), contract interactions can be complex to identify. EVM tracing allows us to dig deeper into the transaction execution cycle and look through the internal calls and any additional proxy contracts the tx interacts with. This is useful for the comprehensive analysis we're interested in.
+
+Trace types (by `action_type`):
+
+* `Call`, which is returned when a method on a contract (same as the tx `to` field or a different one within) is executed. We can identify the input parameters in each instance by looking at this sub trace. 
+* `Self-destruct`, when a contract destroys the code at its address and transfers the ETH held in the contract to a designated recipient address. This is a common pattern among arbitrage bots because of the gas refund savings.
+* `Create`, when a contract deploys another contract and transfers assets to it. 
+* `Reward`, pertaining to the block reward and uncle reward, not relevant here. 
+
+Note that this applies to the Erigon/OpenEthereum `trace` module. Geth has a different, lower-level tracing mechanism that is not relevant for inspect.
+
+### Architecture
+
+
+#### Classified Traces
+
+For each block we intend to inspect, we first fetch all of its traces, transaction receipts, and other additional information. The raw traces are then processed into classified traces (with protocol name, relevant function signature, relevant call inputs, strategy classification and other information) before being passed onto individual inspectors. 
+
+If these classified traces contain only liquidation-related functions, we can pass them to the aave/comp inspectors alone. Similarly, arbitrage profits are reduced by running them through protocol inspectors tagged in this stage (swap/liquidate/buy tx followed by addLiquidity etc).
+
+#### Strategy Inspectors
+
+Each strategy has its own inspector and we define the types based on what output we expect it to return. This could include net profits, but also other information such as whether it was a pre-flight check (querying the reserves to see if the arb is still available) or a successful mev opportunity. 
+
+TODO: generic types we've narrowed them down to
+TODO: table of inspectors pending/wip/ready
+
+#### Tokenflow
+
+This module is built to help us identify misclassifications. Eventually it will serve as a protocol-agnostic profit estimator that other inspectors can import after they identify the target function signature in the traces. For now, we use it alongside our inspectors and store the `diff` for reference.
+
+The method iterates over all the traces and records every ETH inflow/outflow, as well as stablecoin (USDT/USDC/DAI) flows, for the main `eoa`, `contract` (the `to` field, if it's not a known router/aggregator), and `proxy` (helpers used by the searcher, if any). It then computes net profit by subtracting the gas spent from the MEV revenue. All profits are converted to ETH, based on the exchange rate at that block height.
+
+Example: https://etherscan.io/tx/0x4121ce805d33e952b2e6103a5024f70c118432fd0370128d6d7845f9b2987922
+
+ETH=>ENG=>ETH across DEXs
+
+Script output: 
+EOA: 0x00000098163d8908dfbd126c873c9c4732a2c2e6
+Contract: 0x000000000000006f6502b7f2bbac8c30a3f67e9a
+Tx proxy: 0x0000000000000000000000000000000000000000
+Stablecoins inflow/outflow: [0, 0]
+Net ETH profit, Wei 22357881284770142 
+
+More examples can be found under `./tests/tokenflow_test.py`
+
+#### Database
+
+Final `mev_inspections` table schema:
+
+* As of `mev-inspect-rs`:
+    * hash
+    * status
+        * `Success` or `Reverted`
+    * block_number
+    * gas_price
+    * revenue
+        * Revenue searcher makes after accounting for gas used.
+    * protocols
+        * Different protocols that we identify the transaction to touch
+    * actions
+        * Different relevant actions parsed from the transaction traces
+    * eoa
+        * EOA address that initiates the transaction
+    * contract
+        * `to` field, either a custom contract utilized for a searcher to capture MEV or a simple router
+    * proxy_impl
+        * Proxy implementations used by searchers, if any
+    * inserted_at
+
+Additional fields we're potentially interested in (aside from inspector specific information): 
+* miner
+    * Coinbase address of the block miner
+* eth_usd_price
+    * Price of ETH at that block height
+    * Similarly, for any tokens (say in an arbitrage inspection) we query against the relevant uniswap pools.
+* tail_gas_price
+    * Gas price of the transaction displaced in the block (the last tx that would've otherwise been included)
+* tokenflow_estimate_in_eth
+    * Profit outputted by the token flow function
+* tokenflow_diff
+    * Difference between profit estimated by our inspectors and pure token flow analysis
+
+
+
+### Creating an inspector from scratch
+
+If you intend to create your own inspector and submit it as a PR, this should serve as a useful walkthrough to understand the code structure and types. 
+
+Compound V2 has [two](https://compound.finance/docs/ctokens#liquidate-borrow) primary kinds of protocol liquidations on-chain: `liquidateBorrow()` on the cEther contract and `liquidateBorrow()` on individual cToken contracts. The former is when a liquidation bot repays ETH debt (via `msg.value`) to seize an account's collateral. The latter is when a liquidation bot repays the borrowed underlying token (after pre-approving the cToken contract) to liquidate an account.
+
+**Inspector to capture MEV from the first cEther scenario** 
+
+Target function breakdown, from the compound docs: 
+
+<pre>
+<i>function liquidateBorrow(address borrower, address cTokenCollateral) payable</i>
+
+msg.value payable: The amount of ether to be repaid and converted into collateral, in wei.
+msg.sender: The account which shall liquidate the borrower by repaying their debt and seizing their collateral.
+borrower: The account with negative account liquidity that shall be liquidated.
+cTokenCollateral: The address of the cToken currently held as collateral by a borrower, that the liquidator shall seize.
+
+RETURN: No return, reverts on error.</pre>
+
+
+Example [transaction](https://etherscan.io/tx/0xd09e499f2c2d6a900a974489215f25006a5a3fa401a10b8d67fa99480cbb62fb), found using function signature on [bloxy](https://bloxy.info/functions/aae40a2a), which also has [full execution trace](https://bloxy.info/tx/0xd09e499f2c2d6a900a974489215f25006a5a3fa401a10b8d67fa99480cbb62fb) that can be helpful for debugging.
+
+Flow: Classify traces => Parse traces with strategy inspector => Summarize before database insert
+
+#### Classify traces
+1. Add the contract ABI of the target function to `mev_inspect/abis/<protocol_version>/`
+    a. In this instance, we create `CEther.json` under `mev_inspect/abis/compound_v2/`
+    b. This is to ensure the `TraceClassifier` can utilize the ABI decoder (via `get_abi()`) when initialized in `scripts/inspect_block.py` 
+2. Add matching specs in `mev_inspect/schemas/classified_traces.py` (to identify above function/abi when turning raw traces into classified traces)
+    a. Add the following lines for each strategy/protocol
+    ```
+        class Classification(Enum):
+            unknown = "unknown"
+            swap = "swap"
+            burn = "burn"
+            transfer = "transfer"
+    +       liquidate_borrow_ceth = "liquidate_borrow_ceth"  # strategy classification/identification name
+
+
+        class Protocol(Enum):
+            uniswap_v2 = "uniswap_v2"
+            uniswap_v3 = "uniswap_v3"
+            sushiswap = "sushiswap"
+    +       compound_v2 = "compound_v2"  # should match the folder name under `abis`
+    ```
+    b. In `mev_inspect/classifier_specs.py`, add the function signature you're targeting and export it in `CLASSIFIER_SPECS`
+    ```
+        COMPOUND_V2_CETH_SPEC = ClassifierSpec(
+            abi_name="CEther", #should match abi json file name
+            protocol = Protocol.compound_v2,
+            classifications = {
+                "liquidateBorrow(address,address)": Classification.liquidate_borrow_ceth,
+            }
+        )
+    ```
+3. Set up a unit test (`tests/liquidation_test.py` in this case) to verify that the above example tx is classified properly
+    a. `get_filtered_traces(tx_hash)` on `Block` class allows you to filter traces of a specific transaction (for the purpose of this inspector/test)
+```
+
+class TestCompoundV2Liquidation(unittest.TestCase):
+    def test_compound_v2_ceth_liquidation(self):
+        tx_hash = "0xd09e499f2c2d6a900a974489215f25006a5a3fa401a10b8d67fa99480cbb62fb"
+        block_no = 12900060
+        cache_path = _get_cache_path(block_no)
+        block_data = Block.parse_file(cache_path)
+        
+        tx_traces = block_data.get_filtered_traces(tx_hash)
+        trace_classifier = TraceClassifier(CLASSIFIER_SPECS)
+        classified_traces = trace_classifier.classify(tx_traces)
+        res = inspect_compound_v2_ceth(classified_traces)
+        ## res type => Liquidation class with the types defined later below
+        self.assertEqual(res.tx_hash, "0x0")
+        self.assertEqual(res.borrower, "0x0")
+        self.assertEqual(res.collateral_provided, "0x0")
+        self.assertEqual(res.collateral_provided_amount, 0)
+        self.assertEqual(res.asset_seized, "0x0")
+        self.assertEqual(res.asset_seized_amount, 0)
+        self.assertEqual(res.profit_in_eth, 0)
+        self.assertEqual(res.tokenflow_estimate_in_eth, 0)        
+        self.assertEqual(res.tokenflow_diff, 0)
+        self.assertEqual(res.status, LiquidationStatus.seized)        
+        self.assertEqual(res.type, LiquidationType.compound_v2_ceth_liquidation)
+        self.assertEqual(res.collateral_source, LiquidationCollateralSource.other)     
+```
+
+#### Parse traces with strategy inspector
+
+The custom logic for this scenario is handled here: `./mev_inspect/strategy_inspectors/compound_v2_ceth.py`, where we process the classified traces for profit data and additional information using `inspect_compound_v2_ceth(classified_traces: list[ClassifiedTrace]) -> Liquidation`. 
+
+Before writing the inspector, we define the output type returned by this function in `./mev_inspect/schemas/liquidations.py`. This type is unique to each class of strategies (arbitrage/liquidation/sandwich/token sniping etc) and contains all the relevant MEV fields.
+
+```
+from .utils import CamelModel
+from typing import Dict, List, Optional
+from enum import Enum
+
+class LiquidationType(Enum):
+    compound_v2_ceth_liquidation = "compound_v2_ceth_liquidation"
+    compound_v2_ctoken_liquidation = "compound_v2_ctoken_liquidation" # TODO: add logic to handle ctoken liquidations
+
+class LiquidationStatus(Enum):
+    seized = "seized" # successfully completed
+    check = "check" # just a liquidation check, i.e. the searcher only checks if the opportunity is still available and reverts accordingly
+    out_of_gas = "out_of_gas" # tx ran out of gas
+
+class LiquidationCollateralSource(Enum):
+    aave_flashloan = "aave_flashloan"
+    dydx_flashloan = "dydx_flashloan"
+    uniswap_flashloan = "uniswap_flashloan"
+    searcher_eoa = "searcher_eoa" # searcher's own funds
+    other = "other"
+
+class Liquidation(CamelModel):
+    tx_hash: str
+    borrower: str # account that got liquidated
+    collateral_provided: str # collateral provided by searcher, 'ether' or token contract address
+    collateral_provided_amount: int # amount of collateral provided
+    asset_seized: str # asset that was given to searcher at a discount upon liquidation
+    asset_seized_amount: int # amount of asset that was given to searcher upon liquidation
+    profit_in_eth: int # profit estimated by strategy inspector
+    tokenflow_estimate_in_eth: int # profit estimated by tokenflow
+    tokenflow_diff: int # diff between tokenflow and strategy inspector
+    status: LiquidationStatus
+    type: LiquidationType
+    collateral_source: LiquidationCollateralSource
+```
+
+Finally, we get into the core logic (`inspect_compound_v2_ceth()` in `./mev_inspect/strategy_inspectors/compound_v2_ceth.py`).
+
+```
+flow: 
+    1. decide if it's a pre-flight check tx or an actual liquidation
+    2. parse `liquidateBorrow` and `seize` sub traces to determine actual amounts sent to the protocol and sent back to the searcher
+    3. calculate net profit by finding out the worth of seized tokens
+    4. use tokenflow module to find out profit independent of the inspector, calculate diff
+    5. determine source of funds 
+    6. prepare return object to get it ready for db processing
+```
+
+For every inspector, try to verify the profit amount by adding unit tests of sample txs that cover a wide variety of edge cases. Comparing the inspector outputs to that of tokenflow (`tokenflow_diff` in `Liquidation` class in this case) should also help catch misclassifications. 
+
+#### Summarize before database insert
+
+TODO: section about what ends up in the database from all the extracted information, after we finalize tables/schema
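
Editorial annotation on the Tokenflow section of GUIDE.md above: a minimal sketch of the net-profit arithmetic described there (sum ETH inflows and outflows for the searcher's addresses, then subtract the gas spent). The `Transfer` tuple, the `net_eth_profit` helper, the addresses, and the sample values are all hypothetical; this is not the repository's tokenflow module, and it ignores stablecoins and price conversion.

```python
from typing import Iterable, NamedTuple, Set


class Transfer(NamedTuple):
    """Hypothetical, simplified view of one ETH-moving call trace."""
    sender: str
    recipient: str
    value_wei: int


def net_eth_profit(
    transfers: Iterable[Transfer],
    searcher_addresses: Set[str],
    gas_used: int,
    gas_price_wei: int,
) -> int:
    """Net ETH profit in wei: inflows minus outflows minus gas cost."""
    addresses = {a.lower() for a in searcher_addresses}
    inflow = 0
    outflow = 0
    for t in transfers:
        sender_is_searcher = t.sender.lower() in addresses
        recipient_is_searcher = t.recipient.lower() in addresses
        # Moves between the searcher's own addresses net to zero; skip them.
        if sender_is_searcher and recipient_is_searcher:
            continue
        if recipient_is_searcher:
            inflow += t.value_wei
        elif sender_is_searcher:
            outflow += t.value_wei
    return inflow - outflow - gas_used * gas_price_wei


# Made-up addresses and amounts, for illustration only.
EOA = "0xaaaa"
CONTRACT = "0xbbbb"
example = [
    Transfer(EOA, CONTRACT, 10),        # internal move, ignored
    Transfer(CONTRACT, "0xdex1", 100),  # searcher pays into the first pool
    Transfer("0xdex2", CONTRACT, 130),  # searcher is paid by the second pool
]
profit = net_eth_profit(example, {EOA, CONTRACT}, gas_used=2, gas_price_wei=5)
print(profit)
```

Treating the EOA, contract, and proxy as one set of searcher addresses keeps internal hops from being counted as profit or loss.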
--- a/README.md
+++ b/README.md
@ -101,4 +101,4 @@ Install pre-commit with:
 poetry run pre-commit install
 ```

-Update README if needed
+[Full contributor guide with sample inspector](./GUIDE.md)
--- a/cache/11931270-new.json
+++ b/cache/11931270-new.json
--- a/cache/12900060-new.json
+++ b/cache/12900060-new.json
--- a/mev_inspect/abis/compound_v2/CEther.json
+++ b/mev_inspect/abis/compound_v2/CEther.json
--- a/mev_inspect/classifier_specs.py
+++ b/mev_inspect/classifier_specs.py
@ -99,6 +99,13 @@ ERC20_SPEC = ClassifierSpec(
    },
 )

+COMPOUND_V2_CETH_SPEC = ClassifierSpec(
+    abi_name="CEther",
+    protocol=Protocol.compound_v2,
+    classifications={
+        "liquidateBorrow(address,address)": Classification.liquidate_borrow_ceth,
+    },
+)

 CLASSIFIER_SPECS = [
    *UNISWAP_V3_CONTRACT_SPECS,
@ -106,4 +113,5 @@ CLASSIFIER_SPECS = [
    ERC20_SPEC,
    UNISWAP_V3_POOL_SPEC,
    UNISWAPPY_V2_PAIR_SPEC,
+    COMPOUND_V2_CETH_SPEC,
 ]
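
Editorial annotation on the spec added above: a rough sketch of how a signature-to-classification mapping like `COMPOUND_V2_CETH_SPEC` could be looked up. `SpecSketch` and `classify_signature` are hypothetical stand-ins written for this example; the real `ClassifierSpec` and `TraceClassifier` may work differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Classification(Enum):
    unknown = "unknown"
    liquidate_borrow_ceth = "liquidate_borrow_ceth"


class Protocol(Enum):
    compound_v2 = "compound_v2"


@dataclass
class SpecSketch:
    """Stand-in for ClassifierSpec, reduced to the fields shown in the diff."""
    abi_name: str
    protocol: Optional[Protocol] = None
    classifications: Dict[str, Classification] = field(default_factory=dict)


def classify_signature(
    specs: List[SpecSketch], abi_name: str, signature: str
) -> Classification:
    """Return the first matching classification, else Classification.unknown."""
    for spec in specs:
        if spec.abi_name == abi_name and signature in spec.classifications:
            return spec.classifications[signature]
    return Classification.unknown


specs = [
    SpecSketch(
        abi_name="CEther",
        protocol=Protocol.compound_v2,
        classifications={
            "liquidateBorrow(address,address)": Classification.liquidate_borrow_ceth,
        },
    )
]

hit = classify_signature(specs, "CEther", "liquidateBorrow(address,address)")
miss = classify_signature(specs, "CEther", "mint()")
print(hit, miss)
```

The dictionary key is matched as an exact string here, so a signature written with extra spaces or parameter names would fall through to `unknown`.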
--- a/mev_inspect/schemas/classified_traces.py
+++ b/mev_inspect/schemas/classified_traces.py
@ -11,12 +11,14 @@ class Classification(Enum):
    swap = "swap"
    burn = "burn"
    transfer = "transfer"
+    liquidate_borrow_ceth = "liquidate_borrow_ceth"


 class Protocol(Enum):
    uniswap_v2 = "uniswap_v2"
    uniswap_v3 = "uniswap_v3"
    sushiswap = "sushiswap"
+    compound_v2 = "compound_v2"


 class ClassifiedTrace(BaseModel):
--- a/mev_inspect/schemas/liquidations.py
+++ b/mev_inspect/schemas/liquidations.py
@ -0,0 +1,36 @@
+from enum import Enum
+from .utils import CamelModel
+
+
+class LiquidationType(Enum):
+    compound_v2_ceth_liquidation = "compound_v2_ceth_liquidation"
+    compound_v2_ctoken_liquidation = "compound_v2_ctoken_liquidation"  # TODO: add logic to handle ctoken liquidations
+
+
+class LiquidationStatus(Enum):
+    seized = "seized"  # successfully completed
+    check = "check"  # just a liquidation check, i.e. the searcher only checks if the opportunity is still available and reverts accordingly
+    out_of_gas = "out_of_gas"  # tx ran out of gas
+
+
+class LiquidationCollateralSource(Enum):
+    aave_flashloan = "aave_flashloan"
+    dydx_flashloan = "dydx_flashloan"
+    uniswap_flashloan = "uniswap_flashloan"
+    searcher_eoa = "searcher_eoa"  # searcher's own funds
+    other = "other"
+
+
+class Liquidation(CamelModel):
+    tx_hash: str
+    borrower: str  # account that got liquidated
+    collateral_provided: str  # collateral provided by searcher, 'ether' or token contract address
+    collateral_provided_amount: int  # amount of collateral provided
+    asset_seized: str  # asset that was given to searcher at a discount upon liquidation
+    asset_seized_amount: int  # amount of asset that was given to searcher upon liquidation
+    profit_in_eth: int  # profit estimated by strategy inspector
+    tokenflow_estimate_in_eth: int  # profit estimated by tokenflow
+    tokenflow_diff: int  # diff between tokenflow and strategy inspector
+    status: LiquidationStatus
+    type: LiquidationType
+    collateral_source: LiquidationCollateralSource
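
Editorial annotation on the `tokenflow_diff` field above: a small sketch of how the diff between the two profit estimates might be computed and used to flag a possible misclassification. The sign convention (tokenflow estimate minus inspector profit), the helper names, and the tolerance parameter are assumptions made for this example; the schema only states that the field is the difference between the two estimates.

```python
def tokenflow_diff(profit_in_eth: int, tokenflow_estimate_in_eth: int) -> int:
    """Assumed convention: tokenflow estimate minus inspector profit, in wei."""
    return tokenflow_estimate_in_eth - profit_in_eth


def looks_misclassified(
    profit_in_eth: int, tokenflow_estimate_in_eth: int, tolerance_wei: int = 0
) -> bool:
    """True when the two estimates disagree by more than the tolerance."""
    diff = tokenflow_diff(profit_in_eth, tokenflow_estimate_in_eth)
    return abs(diff) > tolerance_wei


# Made-up amounts in wei, for illustration only.
inspector_profit = 1_000_000
tokenflow_estimate = 1_000_250

diff = tokenflow_diff(inspector_profit, tokenflow_estimate)
flag_strict = looks_misclassified(inspector_profit, tokenflow_estimate)
flag_tolerant = looks_misclassified(
    inspector_profit, tokenflow_estimate, tolerance_wei=500
)
print(diff, flag_strict, flag_tolerant)
```

A non-zero tolerance may be useful if the two estimates are expected to differ slightly, for instance because of rounding in price conversion.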
--- a/mev_inspect/strategy_inspectors/compound_v2_ceth.py
+++ b/mev_inspect/strategy_inspectors/compound_v2_ceth.py
@ -0,0 +1,60 @@
+from mev_inspect.schemas.classified_traces import ClassifiedTrace
+from mev_inspect.schemas.liquidations import (
+    Liquidation,
+    LiquidationType,
+    LiquidationStatus,
+)
+
+
+# TODO: check tx status and assign accordingly
+# i.e if a tx checks if the opportunity is still available ("liquidateBorrowAllowed")
+# or if it calls the COMP oracle for price data ("getUnderlyingPrice(address)")
+def is_pre_flight():
+    pass
+
+
+# TODO: fetch historic price (in ETH) of any given token at the block height the tx occurred
+# to calculate the profit in ETH accurately, regardless of what token the profit was held in
+def get_historic_token_price():
+    pass
+
+
+# TODO: for any given cToken, get the underlying token from the comptroller markets
+# i.e cDAI => DAI
+def get_underlying_ctoken_asset():
+    pass
+
+
+# TODO: find if the searcher repays the loan from their own EOA, by buying it from a DEX, or w/ a flashloan
+def find_collateral_source():
+    pass
+
+
+def inspect_compound_v2_ceth(classified_traces: list[ClassifiedTrace]) -> Liquidation:
+    # TODO: complete this logic after asking about type choices
+
+    # flow:
+    # 1. decide if it's a pre-flight check tx or an actual liquidation
+    # 2. parse `liquidateBorrow` and `seize` sub traces to determine actual amounts
+    # 3. calculate net profit by finding out the worth of seized tokens
+    # 4. use tokenflow module to find out profit independent of the inspector, calculate diff
+    # 5. prepare return object to get it ready for db processing
+
+    for classified_trace in classified_traces:
+        if classified_trace.function_name == "liquidateBorrow":
+            liquidation = Liquidation(
+                tx_hash="0x0",
+                borrower="0x0",
+                collateral_provided="0x0",
+                collateral_provided_amount=0,
+                asset_seized="0x0",
+                asset_seized_amount=0,
+                profit_in_eth=0,
+                tokenflow_estimate_in_eth=0,
+                collateral_source="other",
+                tokenflow_diff=0,
+                status=LiquidationStatus.seized,
+                type=LiquidationType.compound_v2_ceth_liquidation,
+            )
+
+    return liquidation
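
Editorial annotation on the function above: `liquidation` is assigned only inside the loop, so a transaction with no `liquidateBorrow` trace would raise `UnboundLocalError` at the `return`. One possible way to handle that case, sketched with hypothetical stand-in types (`TraceSketch`, `LiquidationSketch`) rather than the repository's `ClassifiedTrace` and `Liquidation`, is to return `None` when nothing matches.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TraceSketch:
    """Stand-in for ClassifiedTrace; only the field this example reads."""
    function_name: Optional[str]


@dataclass
class LiquidationSketch:
    """Stand-in for Liquidation, with placeholder values as in the diff."""
    tx_hash: str = "0x0"
    borrower: str = "0x0"
    profit_in_eth: int = 0


def inspect_sketch(traces: List[TraceSketch]) -> Optional[LiquidationSketch]:
    """Return a result for the first `liquidateBorrow` trace, or None if absent."""
    for trace in traces:
        if trace.function_name == "liquidateBorrow":
            return LiquidationSketch()
    # No matching trace: signal "not a liquidation" instead of raising.
    return None


found = inspect_sketch([TraceSketch("transfer"), TraceSketch("liquidateBorrow")])
absent = inspect_sketch([TraceSketch("transfer")])
print(found, absent)
```

Returning `None` changes the return type to `Optional`, so callers and the unit test would need to handle it; raising a descriptive exception is another reasonable choice.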
--- a/tests/liquidation_test.py
+++ b/tests/liquidation_test.py
@ -1,22 +1,42 @@
 import unittest
+from mev_inspect.trace_classifier import TraceClassifier
+from mev_inspect.classifier_specs import CLASSIFIER_SPECS
+from mev_inspect.block import _get_cache_path
+from mev_inspect.strategy_inspectors.compound_v2_ceth import inspect_compound_v2_ceth

-# Fails precommit because these inspectors don't exist yet
-# from mev_inspect import inspector_compound
-# from mev_inspect import inspector_aave
-#
-#
-# class TestLiquidations(unittest.TestCase):
-#     def test_compound_liquidation(self):
-#         tx_hash = "0x0ec6d5044a47feb3ceb647bf7ea4ffc87d09244d629eeced82ba17ec66605012"
-#         block_no = 11338848
-#         res = inspector_compound.get_profit(tx_hash, block_no)
-#         # self.assertEqual(res['profit'], 0)
-#
-#     def test_aave_liquidation(self):
-#         tx_hash = "0xc8d2501d28800b1557eb64c5d0e08fd6070c15b6c04c39ca05631f641d19ffb2"
-#         block_no = 10803840
-#         res = inspector_aave.get_profit(tx_hash, block_no)
-#         # self.assertEqual(res['profit'], 0)
+# from mev_inspect.schemas.classified_traces import Classification
+from mev_inspect.schemas.liquidations import (
+    LiquidationCollateralSource,
+    LiquidationType,
+    LiquidationStatus,
+)
+from mev_inspect.schemas import Block
+
+
+class TestCompoundV2Liquidation(unittest.TestCase):
+    def test_compound_v2_ceth_liquidation(self):
+        tx_hash = "0xd09e499f2c2d6a900a974489215f25006a5a3fa401a10b8d67fa99480cbb62fb"
+        block_no = 12900060
+        cache_path = _get_cache_path(block_no)
+        block_data = Block.parse_file(cache_path)
+
+        tx_traces = block_data.get_filtered_traces(tx_hash)
+        trace_classifier = TraceClassifier(CLASSIFIER_SPECS)
+        classified_traces = trace_classifier.classify(tx_traces)
+        res = inspect_compound_v2_ceth(classified_traces)
+
+        self.assertEqual(res.tx_hash, "0x0")
+        self.assertEqual(res.borrower, "0x0")
+        self.assertEqual(res.collateral_provided, "0x0")
+        self.assertEqual(res.collateral_provided_amount, 0)
+        self.assertEqual(res.asset_seized, "0x0")
+        self.assertEqual(res.asset_seized_amount, 0)
+        self.assertEqual(res.profit_in_eth, 0)
+        self.assertEqual(res.tokenflow_estimate_in_eth, 0)
+        self.assertEqual(res.tokenflow_diff, 0)
+        self.assertEqual(res.status, LiquidationStatus.seized)
+        self.assertEqual(res.type, LiquidationType.compound_v2_ceth_liquidation)
+        self.assertEqual(res.collateral_source, LiquidationCollateralSource.other)


 if __name__ == "__main__":