{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 02 \u2014 Disease signature\n",
    "\n",
    "Week 1 (PLAN.md \u00a76). Pull Open Targets associations and a GEO expression study, run differential expression, and build `sickle_cell_signature_v1.json` (Tier A) with full provenance.\n",
    "\n",
    "Steps:\n",
    "\n",
    "1. Fetch Open Targets associations.\n",
    "2. Choose and download a GEO dataset.\n",
    "3. Run differential expression.\n",
    "4. Build and persist the signature.\n",
    "\n",
    "**Pitfall to document:** in whole blood, expression differences are partly driven by shifts in cell composition rather than by disease state itself (PLAN.md \u00a79.1)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.insert(0, '..')  # make the repo-root src package importable when running from notebooks/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from src import disease\n",
    "from src.provenance import ConfidenceTier\n",
    "\n",
    "# Step 1: fetch Open Targets associations for MONDO:0011382 -> data/raw/open_targets/\n",
    "# Step 2: choose + download a GEO study (candidates: GSE53441, GSE35007, or a newer series) -> data/raw/geo/\n",
    "# Step 3: disease.compute_differential_expression(...)\n",
    "# Step 4: disease.build_signature(...), then disease.persist_signature(...)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}