CFDE Gene-Centric Appyter: ORC6

Given the gene ORC6, we request information about it from several different DCCs in hopes of creating a comprehensive knowledge report for it.

MyGeneInfo: Query

https://mygene.info/

To interoperate with different APIs that support different gene identifier schemes, we first use mygene.info to resolve gene identifiers.

{
    "took": 24,
    "total": 796,
    "max_score": 139.63954,
    "hits": [10 items]
}

GeneID: 23594
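
A minimal sketch of this resolution step, assuming the public mygene.info v3 query endpoint and the Python standard library. The helper names (build_query_url, pick_entrez_id, resolve_symbol) and the species, fields and size choices are illustrative and are not taken from the appyter's source.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public mygene.info v3 query service.
MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def build_query_url(symbol, species="human", size=10):
    """Build a query URL asking only for the fields needed to resolve an ID."""
    params = {
        "q": symbol,
        "species": species,
        "fields": "entrezgene,symbol",
        "size": size,
    }
    return MYGENE_QUERY_URL + "?" + urllib.parse.urlencode(params)


def pick_entrez_id(response, symbol):
    """Return the Entrez Gene ID of the first hit whose symbol matches exactly."""
    for hit in response.get("hits", []):
        if hit.get("symbol") == symbol and "entrezgene" in hit:
            return str(hit["entrezgene"])
    return None


def resolve_symbol(symbol):
    """Query mygene.info over the network and resolve a symbol to an Entrez ID."""
    with urllib.request.urlopen(build_query_url(symbol), timeout=30) as resp:
        return pick_entrez_id(json.load(resp), symbol)
```

Matching on the exact symbol rather than taking the top-scoring hit guards against aliases and pseudogenes that share a prefix with the query.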

MyGeneInfo

https://mygene.info/

With the Entrez Gene ID, we can resolve many other identifiers and identifying information from mygene.info.

{
    "AllianceGenome": "17151",
    "HGNC": "17151",
    "MIM": "607213",
    "_id": "23594",
    "_version": 1,
    "accession": {4 items},
    "agr": {1 item},
    "alias": "ORC6L",
    "clingen": {2 items},
    "ensembl": {5 items},
    "entrezgene": "23594",
    "exac": {9 items},
    "exons": [2 items],
    "exons_hg19": [2 items],
    "generif": [21 items],
    "genomic_pos": {5 items},
    "genomic_pos_hg19": {4 items},
    "go": {3 items},
    "homologene": {2 items},
    "interpro": [2 items],
    "ipi": "IPI00001641",
    "map_location": "16q11.2",
    "name": "origin recognition complex subunit 6",
    "other_names": "origin recognition complex subunit 6",
    "pantherdb": {4 items},
    "pathway": {4 items},
    "pdb": [2 items],
    "pfam": "PF05460",
    "pharmgkb": "PA32813",
    "pharos": {2 items},
    "reagent": {5 items},
    "refseq": {4 items},
    "reporter": {5 items},
    "summary": "The origin recognition complex (ORC) is a highly conserved six subunit protein complex essential for the initiation of the DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional initiation factors such as Cdc6 and Mcm proteins. The protein encoded by this gene is a subunit of the ORC complex. Gene silencing studies with small interfering RNA demonstrated that this protein plays an essential role in coordinating chromosome replication and segregation with cytokinesis. [provided by RefSeq, Oct 2010].",
    "symbol": "ORC6",
    "taxid": 9606,
    "type_of_gene": "protein-coding",
    "umls": {1 item},
    "unigene": "Hs.49760",
    "uniprot": {2 items},
    "wikipedia": {1 item}
}

Gene Symbol: ORC6
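
The record above can be reduced to the handful of identifiers that downstream APIs need. The sketch below assumes the mygene.info v3 gene endpoint, and it assumes that the ensembl and uniprot entries carry "gene" and "Swiss-Prot" keys, which the collapsed output does not show. fetch_gene and extract_identifiers are hypothetical helper names.

```python
import json
import urllib.request

# Assumed endpoint: mygene.info v3 gene annotation service.
MYGENE_GENE_URL = "https://mygene.info/v3/gene/{gene_id}"


def fetch_gene(gene_id):
    """Download the full annotation record for one Entrez Gene ID."""
    url = MYGENE_GENE_URL.format(gene_id=gene_id)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def _first(value, key):
    """Read `key` from a dict, or from the first element of a list of dicts."""
    if isinstance(value, list):
        value = value[0] if value else {}
    if isinstance(value, dict):
        return value.get(key)
    return None


def extract_identifiers(record):
    """Collect the cross-references used by the other queries in this report."""
    return {
        "entrez": record.get("entrezgene"),
        "symbol": record.get("symbol"),
        "hgnc": record.get("HGNC"),
        "omim": record.get("MIM"),
        "ensembl_gene": _first(record.get("ensembl"), "gene"),   # assumed key
        "uniprot": _first(record.get("uniprot"), "Swiss-Prot"),  # assumed key
    }
```

Some fields may come back either as a single object or as a list of objects, which is why _first accepts both shapes.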


Primary Information

We query DCC APIs to gain insights about the primary information they collect.

GTEx

https://gtexportal.org/home/

We query the GTEx data through the GTEx API to identify tissue sites that significantly express the gene in question.

Gene with identifier ORC6 currently not available in GTEx
Could not process GTEx output

LINCS

https://lincsproject.org/

L1000 RNAseq Gene Centric Signature Reverse Search (RGCSRS)

An appyter was built for performing Gene Centric signature reverse searches against the LINCS data. Its functionality is repeated here.

Top CRISPR KO signatures where ORC6 is up-regulated (based on fold change)
Signature                         CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen  Dose  Cell Line  Timepoint
XPR042_KELLY.311_96H_M01_EPHB2    0.0228          1.0901       0.124494           2599.0             EPHB2              KELLY.311  96h
XPR014_HT29.311_96H_D04_CLEC12B   0.0210          1.0889       0.122807           4721.0             CLEC12B            HT29.311   96h
XPR015_A549.311_96H_F10_DGKK      0.0213          1.0862       0.119257           1869.0             DGKK               A549.311   96h
XPR010_A549.311_96H_N22_GRIA2     0.0213          1.0855       0.118309           2299.0             GRIA2              A549.311   96h
XPR043_A549.311_96H_F03_HSP90AB1  0.0259          1.0849       0.117596           2878.0             HSP90AB1           A549.311   96h
XPR024_A375.311_96H_O01_SIRPG     0.0232          1.0835       0.115679           3781.0             SIRPG              A375.311   96h
XPR014_HT29.311_96H_L08_CLIC4     0.0241          1.0811       0.112458           5378.0             CLIC4              HT29.311   96h
XPR013_PC3.311B_96H_G17_PARP1     0.0212          1.0701       0.097760           2099.0             PARP1              PC3.311B   96h
XPR016_MCF7.311_96H_H05_EML4      0.0217          1.0682       0.095116           2148.0             EML4               MCF7.311   96h
XPR016_HT29.311_96H_A15_DOK6      0.0240          1.0673       0.093991           1802.0             DOK6               HT29.311   96h
Top CRISPR KO signatures where ORC6 is down-regulated (based on fold change)
Signature                         CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen  Dose  Cell Line  Timepoint
XPR033_ES2.311_96H_C10_CDK4       -0.0231         0.7300       -0.454006          2490.0             CDK4               ES2.311    96h
XPR021_MCF7.311_96H_K08_NCOA3     -0.0214         0.7912       -0.337798          1669.0             NCOA3              MCF7.311   96h
XPR043_A549.311_96H_E14_VCP       -0.0240         0.8019       -0.318568          2269.0             VCP                A549.311   96h
XPR025_ES2.311_96H_L11_CDK4       -0.0242         0.8028       -0.316830          2913.0             CDK4               ES2.311    96h
XPR027_A549.311_96H_I04_WDR12     -0.0215         0.8340       -0.261806          2362.0             WDR12              A549.311   96h
XPRJJ001_A375_96H_A04_AURKB       -0.0216         0.8421       -0.247883          2180.0             AURKB              A375       96h
XPR018_A375.311_96H_G07_IFNA21    -0.0227         0.8550       -0.226049          1952.0             IFNA21             A375.311   96h
XPR033_A549.311_96H_C21_GJA5      -0.0214         0.8918       -0.165213          3096.0             GJA5               A549.311   96h
XPR009_MCF7.311_96H_K15_HTR2A     -0.0292         0.8952       -0.159702          2549.0             HTR2A              MCF7.311   96h
XPR012_MCF7.311_96H_M19_ART3      -0.0243         0.8964       -0.157748          1891.0             ART3               MCF7.311   96h
Top Chemical Perturbation signatures where ORC6 is up-regulated (based on fold change)
Signature                                          CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen          Dose    Cell Line       Timepoint
ISO001_MCF10A.EGFR.HL_24H_L14_MK-2206_3.33uM       0.0241          1.5235       0.607424           1527.0             MK-2206              3.33uM  MCF10A.EGFR.HL  24h
ISO001_MCF10A.TP53.M_24H_D14_doxorubicin_3.33uM    0.0235          1.5198       0.603911           1125.0             doxorubicin          3.33uM  MCF10A.TP53.M   24h
ISO001_MCF10A.TP53.M_24H_D02_doxorubicin_3.33uM    0.0297          1.4962       0.581339           1225.0             doxorubicin          3.33uM  MCF10A.TP53.M   24h
CPC012_VCAP_24H_B23_treprostinil_10uM              0.0229          1.2506       0.322591           1174.0             treprostinil         10uM    VCAP            24h
LJP007_HA1E_24H_L22_WZ-4002_0.37uM                 0.0225          1.1719       0.228902           1426.0             WZ-4002              0.37uM  HA1E            24h
LJP007_MCF7_24H_J24_doramapimod_0.04uM             0.0256          1.1661       0.221701           1889.0             doramapimod          0.04uM  MCF7            24h
LJP007_MCF7_24H_L23_WZ-4002_0.12uM                 0.0256          1.1626       0.217414           1863.0             WZ-4002              0.12uM  MCF7            24h
CPC006_CORL23_6H_I13_quinine_10uM                  0.0257          1.1277       0.173320           4407.0             quinine              10uM    CORL23          6h
MOAR010_OVTOKO_24H_P17_bendroflumethiazide_3.33uM  0.0230          1.1089       0.149170           3218.0             bendroflumethiazide  3.33uM  OVTOKO          24h
FIBR026_MCLF141SZ_6H_J23_nifedipine_4uM            0.0250          1.1034       0.141998           3428.0             nifedipine           4uM     MCLF141SZ       6h
Top Chemical Perturbation signatures where ORC6 is down-regulated (based on fold change)
Signature                                          CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen          Dose    Cell Line       Timepoint
REP.B012_MCF10A_24H_O20_cobimetinib_0.74uM         -0.0290         0.4834       -1.048588          1353.0             cobimetinib          0.74uM  MCF10A          24h
CPC007_A549_24H_C02_wortmannin_10uM                -0.0311         0.5082       -0.976661          1466.0             wortmannin           10uM    A549            24h
CRCGN005_A549_24H_K07_wortmannin_10uM              -0.0250         0.5501       -0.862117          1542.0             wortmannin           10uM    A549            24h
PBIOA013_MCF7_24H_O07_etoposide_10uM               -0.0277         0.5672       -0.818022          1370.0             etoposide            10uM    MCF7            24h
CPC005_A549_24H_A04_wortmannin_10uM                -0.0246         0.5673       -0.817915          1754.0             wortmannin           10uM    A549            24h
PBIOA015_MCF7_24H_O08_irinotecan_3.33uM            -0.0272         0.5729       -0.803728          1634.0             irinotecan           3.33uM  MCF7            24h
AICHI002_OCILY3_24H_K09_mitoxantrone_0.66uM        -0.0268         0.5764       -0.794841          1815.0             mitoxantrone         0.66uM  OCILY3          24h
CPC007_PC3_24H_C02_wortmannin_10uM                 -0.0254         0.5796       -0.786774          1306.0             wortmannin           10uM    PC3             24h
PBIOA022_MCF7_24H_D22_topotecan_0.37uM             -0.0243         0.5890       -0.763738          1656.0             topotecan            0.37uM  MCF7            24h
REP.A009_MCF10A_24H_G24_TAK-733_0.04uM             -0.0256         0.6083       -0.717247          1876.0             TAK-733              0.04uM  MCF10A          24h
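
In the tables above, Log2(Fold Change) is the base-2 logarithm of Fold Change, so a fold change above 1 (positive log2) means ORC6 is up-regulated in that signature and a fold change below 1 means it is down-regulated. The short sketch below checks this relationship on three rows copied from the tables. The rows list and the function names are illustrative, and the tolerance allows for the fold changes being rounded to four decimals.

```python
import math

# (signature, fold change, reported log2 fold change), copied from the tables above.
rows = [
    ("XPR042_KELLY.311_96H_M01_EPHB2", 1.0901, 0.124494),
    ("XPR033_ES2.311_96H_C10_CDK4", 0.7300, -0.454006),
    ("REP.B012_MCF10A_24H_O20_cobimetinib_0.74uM", 0.4834, -1.048588),
]


def direction(fold_change):
    """Classify a signature by whether the gene goes up or down."""
    if fold_change > 1:
        return "up"
    if fold_change < 1:
        return "down"
    return "unchanged"


def is_consistent(fold_change, reported_log2, tol=1e-3):
    """Check that the reported log2 value matches log2(fold change)."""
    return abs(math.log2(fold_change) - reported_log2) < tol


for signature, fc, log2_fc in rows:
    print(signature, direction(fc), round(math.log2(fc), 4), is_consistent(fc, log2_fc))

# Rank signatures from strongest down-regulation to strongest up-regulation.
ranked = sorted(rows, key=lambda row: row[1])
```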

International Mouse Phenotyping Consortium (IMPC)

https://www.mousephenotype.org/

IMPC serves mouse phenotype information associated with gene markers. Its API allows us to identify phenotypes significantly associated with a gene.

[Figure: Phenotype known to be associated with ORC6 from IMPC. Axes: -logp(combined_stouffer_statistic) (0 to 8) and mp_term_name. Terms shown: persistence of hyaloid vascular system; decreased circulating creatinine level; hyperactivity; impaired pupillary reflex.]
/usr/local/lib/python3.8/dist-packages/pandas/core/arraylike.py:397: RuntimeWarning: divide by zero encountered in log10
  result = getattr(ufunc, method)(*inputs, **kwargs)
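
The RuntimeWarning above is raised when a value of exactly 0, presumably a p-value here, is passed to log10, which makes the corresponding -log10(p) infinite. One common workaround is to clamp p-values to a small floor before transforming them. The sketch below uses only the standard library. P_FLOOR and both function names are hypothetical, and this illustrates the idea rather than the appyter's own handling.

```python
import math

# Hypothetical floor: the smallest p-value we are willing to plot.
P_FLOOR = 1e-300


def neg_log10_p(p_value, floor=P_FLOOR):
    """Return -log10(p), clamping p to `floor` so that p == 0 stays finite."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value out of range: {p_value}")
    return -math.log10(max(p_value, floor))


def rank_by_significance(p_values):
    """Sort (term, -log10 p) pairs from most to least significant."""
    scored = [(term, neg_log10_p(p)) for term, p in p_values.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Clamping keeps every bar finite, at the cost of capping the most significant terms at the same height.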

GlyGen

https://www.glygen.org/

GlyGen collects extensive protein product information related to glycans and makes that information accessible through its API.

/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py:1043: InsecureRequestWarning: Unverified HTTPS request is being made to host 'api.glygen.org'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings

No information for gene with identifier ORC6 found in GlyGen
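
The InsecureRequestWarning above indicates that the request to api.glygen.org was made with certificate verification turned off. A hedged sketch of a verified request using only the standard library follows. fetch_json_verified is a hypothetical helper, and no particular GlyGen endpoint path is assumed.

```python
import json
import ssl
import urllib.request


def make_verifying_context():
    """Create a TLS context that checks certificates and host names."""
    # create_default_context() enables both checks by default.
    return ssl.create_default_context()


def fetch_json_verified(url, timeout=30):
    """GET a JSON document over HTTPS with certificate verification enabled."""
    context = make_verifying_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        return json.load(resp)
```

With the requests library, the equivalent is to leave the verify argument at its default of True.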

exRNA

https://ldh.clinicalgenome.org/ldh/ui/

The exRNA Linked Data Hub (LDH) facilitates efficient access to collated information such as links and select data from different data sources, which are made available using RESTful APIs. Currently, LDH focuses on linking information about human genes and variants to support exRNA curation efforts.

We provide the gene symbol to exRNA and obtain the reported linked data. The query produces a document listing all associated regulatory elements that overlap the gene or lie within +/- 10 kb of it.

{
    "data": {11 items},
    "metadata": {1 item},
    "status": {2 items}
}

HuBMAP

https://hubmapconsortium.org/

The goal of the Human BioMolecular Atlas Program (HuBMAP) is to develop an open and global platform to map healthy cells in the human body.

The HuBMAP ASCT+B Data was processed and is served by Enrichr. This data can be used to associate genes with cell types.

No information for gene with identifier ORC6 found in HuBMAP ASCT+B

Metabolomics

https://metabolomicsworkbench.org/

The National Institutes of Health (NIH) Common Fund Metabolomics Program was developed with the goal of increasing national capacity in metabolomics by supporting the development of next generation technologies, providing training and mentoring opportunities, increasing the inventory and availability of high quality reference standards, and promoting data sharing and collaboration.

MetGENE identifies the pathways and reactions catalyzed by the given gene ORC6, its related metabolites and the studies in Metabolomics Workbench with data on such metabolites.


Secondary Information

Each DCC has assembled a large repository of knowledge besides the data directly collected by the data generation centers they coordinate. We can access this expanded knowledge as well.

IDG

https://druggablegenome.net/

Pharos

We query IDG's knowledge base of targets and their Disease associations through the Pharos API.

[Figure: Disease known to be associated with ORC6 from IDG's Pharos. Axes: -logp(combined_stouffer_statistic) (0 to 35) and name. Diseases shown: Breast cancer; EAR, PATELLA, SHORT STATURE SYNDROME; Intellectual Disability; MEIER-GORLIN SYNDROME 3; Meier-Gorlin Syndrome 3; Meier-Gorlin syndrome; Meier-Gorlin syndrome 3; Mesothelioma, malignant; ear, patella, short stature syndrome; intellectual disability; adrenocortical carcinoma; malignant mesothelioma; intraductal papillary-mucinous carcinoma (IPMC); psoriasis; lung adenocarcinoma.]
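
Several disease names in the chart above differ only in capitalization, for example "Meier-Gorlin syndrome 3" and "MEIER-GORLIN SYNDROME 3". As one possible post-processing step, which the report itself does not perform, such names could be merged case-insensitively. dedupe_names is a hypothetical helper.

```python
def dedupe_names(names):
    """Keep the first spelling of each name, comparing case-insensitively."""
    seen = set()
    unique = []
    for name in names:
        key = " ".join(name.split()).casefold()  # normalize spacing and case
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


# Disease names as they appear in the chart above.
diseases = [
    "Breast cancer",
    "EAR, PATELLA, SHORT STATURE SYNDROME",
    "Intellectual Disability",
    "MEIER-GORLIN SYNDROME 3",
    "Meier-Gorlin Syndrome 3",
    "Meier-Gorlin syndrome",
    "Meier-Gorlin syndrome 3",
    "Mesothelioma, malignant",
    "ear, patella, short stature syndrome",
    "intellectual disability",
    "adrenocortical carcinoma",
    "malignant mesothelioma",
    "intraductal papillary-mucinous carcinoma (IPMC)",
    "psoriasis",
    "lung adenocarcinoma",
]

unique_diseases = dedupe_names(diseases)
```

This merges only spellings that differ in case or spacing. Genuinely different labels such as "Mesothelioma, malignant" and "malignant mesothelioma" stay separate and would need an ontology mapping to reconcile.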

Harmonizome

We query the Harmonizome API for associations between the gene and various biological entities across a standardized collection of omics datasets.

{
    "symbol": "ORC6",
    "synonyms": [1 item],
    "name": "origin recognition complex, subunit 6",
    "description": "The origin recognition complex (ORC) is a highly conserved six subunit protein complex essential for the initiation of the DNA replication in eukaryotic cells. Studies in yeast demonstrated that ORC binds specifically to origins of replication and serves as a platform for the assembly of additional initiation factors such as Cdc6 and Mcm proteins. The protein encoded by this gene is a subunit of the ORC complex. Gene silencing studies with small interfering RNA demonstrated that this protein plays an essential role in coordinating chromosome replication and segregation with cytokinesis. [provided by RefSeq, Oct 2010]",
    "ncbiEntrezGeneId": 23594,
    "ncbiEntrezGeneUrl": "http://www.ncbi.nlm.nih.gov/gene/23594",
    "proteins": [1 item],
    "hgncRootFamilies": [],
    "associations": [4897 items]
}
[Figure: Significant associations with ORC6 in IDG's Harmonizome. Axes: absoluteZscore (0 to 10) and name; panels: direction=down, direction=up. Entries shown (name/dataset):
BRD-K80439500/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K20525312/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K30821056/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K40112492/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K79517375/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K46149455/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
MOLT-13/Sanger Dependency Map Cancer Cell Line Proteomics
BRD-K62288678/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K21620541/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
GTEx SmallIntestine 20-29 vs 50-59/GTEx Tissue-Specific Aging Signatures
BRD-K79228133/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
BRD-K23162004/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures
Primary somatosensory area, upper limb, layer 1/Allen Brain Atlas Adult Mouse Brain Tissue Gene Expression Profiles
BICR6_UPPER_AERODIGESTIVE_TRACT/CCLE Cell Line Proteomics
SUDHL4_HAEMATOPOIETIC_AND_LYMPHOID_TISSUE/CCLE Cell Line Proteomics
SH4_SKIN/CCLE Cell Line Proteomics
Thyroid/GTEx Tissue Gene Expression Profiles 2023
SKLU1_LUNG/CCLE Cell Line Proteomics
Heart-hepatocyte/Tabula Sapiens Gene-Cell Associations
JUN_01_164/KnockTF Gene Expression Profiles with Transcription Factor Perturbations]
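
The chart above splits associations by direction and shows ten entries per panel, ranked by absolute z-score. A sketch of that selection step follows. It assumes that each entry of the associations list carries a geneSet object with a name and a signed standardizedValue; these field names are recalled from the Harmonizome API and are not visible in the collapsed output. top_associations is a hypothetical helper.

```python
def top_associations(associations, n=10):
    """Split associations by sign and keep the n largest |z| in each direction."""
    scored = []
    for assoc in associations:
        z = assoc.get("standardizedValue")  # assumed field name
        if z is None or z == 0:
            continue  # no direction can be assigned
        name = assoc.get("geneSet", {}).get("name", "")  # assumed field name
        scored.append({
            "name": name,
            "direction": "up" if z > 0 else "down",
            "absoluteZscore": abs(z),
        })
    result = {}
    for direction in ("up", "down"):
        subset = [s for s in scored if s["direction"] == direction]
        subset.sort(key=lambda s: s["absoluteZscore"], reverse=True)
        result[direction] = subset[:n]
    return result
```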

ARCHS4

https://maayanlab.cloud/archs4/

ARCHS4 has processed numerous GEO studies and also provides tissue expression data.

UniProt

https://www.uniprot.org/

UniProt is a comprehensive database of protein function information. Its Proteins REST API can be used for gene-centric queries.

https://www.ebi.ac.uk/proteins/api/genecentric?offset=0&size=100&gene=ORC6
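
The gene-centric endpoint above takes offset, size and gene query parameters. A small sketch for building that URL for any gene symbol and requesting the result follows. The function names are hypothetical, and the Accept: application/json header is an assumption about how the service negotiates response formats.

```python
import json
import urllib.parse
import urllib.request

GENECENTRIC_URL = "https://www.ebi.ac.uk/proteins/api/genecentric"


def build_genecentric_url(gene, offset=0, size=100):
    """Build the gene-centric query URL for one gene symbol."""
    params = {"offset": offset, "size": size, "gene": gene}
    return GENECENTRIC_URL + "?" + urllib.parse.urlencode(params)


def fetch_genecentric(gene):
    """Request the gene-centric records for a gene symbol as parsed JSON."""
    request = urllib.request.Request(
        build_genecentric_url(gene),
        headers={"Accept": "application/json"},  # assumed content negotiation
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Deriving the gene parameter from the symbol resolved earlier in the report, rather than hard-coding it, keeps this query in step with the gene being profiled.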