CFDE Gene-Centric Appyter: KRT2

Given the gene KRT2, we request information about it from several different DCCs in hopes of creating a comprehensive knowledge report for it.

MyGeneInfo: Query

https://mygene.info/

To interoperate with different APIs that support different gene identifier schemes, we first use mygene.info to resolve gene identifiers.
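The lookup can be sketched roughly as follows. This is a minimal illustration, not the appyter's actual code: it assumes the public mygene.info v3 `query` endpoint, and the helper names (`build_query_url`, `pick_entrez_id`, `resolve_symbol`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public mygene.info v3 query endpoint.
MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def build_query_url(symbol, species="human"):
    """Build a mygene.info query URL for a gene symbol."""
    params = {"q": symbol, "species": species, "size": 10}
    return MYGENE_QUERY_URL + "?" + urllib.parse.urlencode(params)


def pick_entrez_id(response, symbol):
    """Return the Entrez Gene ID of the first hit whose symbol matches exactly.

    Hits are returned in score order, but a fuzzy match can outrank the exact
    symbol, so the symbol is compared explicitly.
    """
    for hit in response.get("hits", []):
        if hit.get("symbol", "").upper() == symbol.upper() and "entrezgene" in hit:
            return str(hit["entrezgene"])
    return None


def resolve_symbol(symbol):
    """Query mygene.info over the network and resolve a symbol to an Entrez ID."""
    with urllib.request.urlopen(build_query_url(symbol), timeout=30) as resp:
        return pick_entrez_id(json.load(resp), symbol)
```

Calling `resolve_symbol("KRT2")` performs the network request; `pick_entrez_id` can be exercised offline on any saved response.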

{
    "took": 10,
    "total": 233,
    "max_score": 97.44715,
    "hits": [10 items]
}

GeneID: 3849

MyGeneInfo

https://mygene.info/

With the Entrez Gene ID, we can retrieve many other identifiers and annotations for the gene from mygene.info.

{
    "AllianceGenome": "6439",
    "HGNC": "6439",
    "MIM": "600194",
    "_id": "3849",
    "_version": 2,
    "accession": {4 items},
    "alias": [5 items],
    "ensembl": {5 items},
    "entrezgene": "3849",
    "exac": {9 items},
    "exons": [1 item],
    "exons_hg19": [1 item],
    "generif": [7 items],
    "genomic_pos": {5 items},
    "genomic_pos_hg19": {4 items},
    "go": {3 items},
    "homologene": {2 items},
    "interpro": [4 items],
    "ipi": "IPI00021304",
    "map_location": "12q13.13",
    "name": "keratin 2",
    "other_names": [7 items],
    "pantherdb": {4 items},
    "pathway": {1 item},
    "pfam": [2 items],
    "pharmgkb": "PA30227",
    "pharos": {2 items},
    "pir": "A44861",
    "prosite": "PS51842",
    "reagent": {3 items},
    "refseq": {4 items},
    "reporter": {6 items},
    "summary": "The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. [provided by RefSeq, Jul 2008].",
    "symbol": "KRT2",
    "taxid": 9606,
    "type_of_gene": "protein-coding",
    "umls": {1 item},
    "unigene": "Hs.707",
    "uniprot": {1 item},
    "wikipedia": {1 item}
}

Gene Symbol: KRT2
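As an illustration of how such a record might be reduced to the identifiers needed by downstream APIs, here is a small sketch. The function name `extract_identifiers` is hypothetical, and the nested field names (`ensembl.gene`, `uniprot["Swiss-Prot"]`) are assumptions about the record layout, since those sections are collapsed in the output above.

```python
def extract_identifiers(record):
    """Collect commonly used cross-references from a mygene.info gene record."""
    wanted = ("symbol", "entrezgene", "HGNC", "MIM")
    ids = {key: str(record[key]) for key in wanted if key in record}

    # Assumption: "ensembl" is a dict (or a list of dicts) with a "gene" key.
    ensembl = record.get("ensembl")
    if isinstance(ensembl, list):
        ensembl = ensembl[0] if ensembl else None
    if isinstance(ensembl, dict) and "gene" in ensembl:
        ids["ensembl_gene"] = ensembl["gene"]

    # Assumption: "uniprot" is a dict with a "Swiss-Prot" accession.
    uniprot = record.get("uniprot")
    if isinstance(uniprot, dict) and "Swiss-Prot" in uniprot:
        ids["uniprot"] = uniprot["Swiss-Prot"]
    return ids


# Values taken from the record shown above; collapsed sections are omitted.
record = {"symbol": "KRT2", "entrezgene": "3849", "HGNC": "6439", "MIM": "600194"}
identifiers = extract_identifiers(record)
```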


Primary Information

We query DCC APIs to gain insights about the primary information they collect.

GTEx

https://gtexportal.org/home/

We query the GTEx data through the GTEx API to identify tissue sites that significantly express the gene in question.

Gene with identifier KRT2 currently not available in GTEx
Could not process GTEx output

LINCS

https://lincsproject.org/

L1000 RNAseq Gene Centric Signature Reverse Search (RGCSRS)

An appyter was built for performing Gene Centric signature reverse searches against the LINCS data. Its functionality is repeated here.

Top CRISPR KO signatures where KRT2 is up-regulated (based on fold change)

Signature                       CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen  Dose  Cell Line  Timepoint
XPR012_MCF7.311_96H_B04_HADH    0.0271          inf          0.000000           9670.0             HADH               MCF7.311   96h
XPR027_ES2.311_96H_P04_ZNF184   0.0112          694.3725     9.439566           142.0              ZNF184             ES2.311    96h
XPR027_ES2.311_96H_F05_ZBTB46   -0.0082         203.2417     7.667053           1161.0             ZBTB46             ES2.311    96h
ZTO.XPR001_U937_408H_E13_NIPBL  0.0202          1.4636       0.549476           322.0              NIPBL              U937       408h
ZTO.XPR001_TF1_408H_E09_SMC3    0.0146          1.2050       0.269031           953.0              SMC3               TF1        408h

Top CRISPR KO signatures where KRT2 is down-regulated (based on fold change)

Signature                          CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen  Dose  Cell Line   Timepoint
XPR033_U251MG.311_96H_J24_POLE3    0.0187          0.0000       0.0                12466.5            POLE3              U251MG.311  96h
XPR033_U251MG.311_96H_J23_CDK7     -0.0100         0.0000       0.0                12012.0            CDK7               U251MG.311  96h
XPR033_U251MG.311_96H_D07_KCNT2    -0.0097         0.0000       0.0                12620.0            KCNT2              U251MG.311  96h
XPR033_U251MG.311_96H_H11_KRT12    -0.0094         0.0000       0.0                13157.0            KRT12              U251MG.311  96h
XPR033_U251MG.311_96H_B16_SLCO1B3  -0.0093         0.0000       0.0                12447.0            SLCO1B3            U251MG.311  96h
XPR033_U251MG.311_96H_A15_NDUFB8   -0.0092         0.0000       0.0                13208.5            NDUFB8             U251MG.311  96h
XPR033_U251MG.311_96H_F04_LYZ      -0.0091         0.0000       0.0                13462.0            LYZ                U251MG.311  96h
XPR012_MCF7.311_96H_B13_BRD7       -0.0090         0.0000       0.0                12302.0            BRD7               MCF7.311    96h
XPR033_U251MG.311_96H_B03_CCNA2    -0.0090         0.0000       0.0                12986.0            CCNA2              U251MG.311  96h
XPR035_MCF7.311_96H_A15_CSNK1D     -0.0088         0.0000       0.0                11838.5            CSNK1D             MCF7.311    96h

Top Chemical Perturbation signatures where KRT2 is up-regulated (based on fold change)

Signature                                         CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen          Dose    Cell Line  Timepoint
CPC020_HCC515_6H_H18_proxymetacaine_10uM          0.0249          inf          0.000000           10139.5            proxymetacaine       10uM    HCC515     6h
REP.A004_HELA_24H_M10_camicinal_0.37uM            0.0258          2993.8740    11.547798          13.0               camicinal            0.37uM  HELA       24h
CPD002_PC3_6H_B04_vorinostat_10uM                 0.0283          14.1684      3.824610           140.0              vorinostat           10uM    PC3        6h
REP.A004_HA1E_24H_I18_tafamidis-meglumine_0.04uM  -0.0246         6.1869       2.629215           1200.0             tafamidis-meglumine  0.04uM  HA1E       24h
FIBR016_MCLF022CN_6H_F09_valproic-acid_625uM      0.0221          5.6769       2.505094           49.0               valproic-acid        625uM   MCLF022CN  6h
CPD002_PC3_6H_E18_acenocoumarol_10uM              0.0282          4.7948       2.261467           57.0               acenocoumarol        10uM    PC3        6h
CPD002_PC3_6H_I06_thiostrepton_10uM               0.0219          4.2294       2.080459           154.0              thiostrepton         10uM    PC3        6h
CPD002_PC3_6H_N05_amisulpride_10uM                0.0214          4.0538       2.019268           64.0               amisulpride          10uM    PC3        6h
CPD002_PC3_6H_B12_flurandrenolide_10uM            0.0224          2.9988       1.584384           74.0               flurandrenolide      10uM    PC3        6h
CPD002_PC3_6H_B07_mestinon_10uM                   0.0232          2.9532       1.562255           91.0               mestinon             10uM    PC3        6h

Top Chemical Perturbation signatures where KRT2 is down-regulated (based on fold change)

Signature                                     CD Coefficient  Fold Change  Log2(Fold Change)  Rank in Signature  Perturbagen         Dose    Cell Line  Timepoint
LJP002_MCF10A_24H_E09_SB-525334_10uM          0.0273          0.0000       0.0                13090.5            SB-525334           10uM    MCF10A     24h
CPD002_PC3_6H_C08_omeprazole_10uM             -0.0256         0.0000       0.0                9467.5             omeprazole          10uM    PC3        6h
FIBR016_MCLF022CN_6H_P04_PP-110_4uM           -0.0241         0.0000       0.0                11782.0            PP-110              4uM     MCLF022CN  6h
CPD002_PC3_6H_J18_estropipate_10uM            -0.0265         0.0000       0.0                15162.0            estropipate         10uM    PC3        6h
CPD002_PC3_6H_J21_hydroxychloroquine_10uM     -0.0290         0.0000       0.0                12506.5            hydroxychloroquine  10uM    PC3        6h
LJP002_MCF10A_24H_H18_TPCA-1_0.08uM           -0.0234         0.0000       0.0                14197.5            TPCA-1              0.08uM  MCF10A     24h
REP.A004_HA1E_24H_I03_ABT-239_1.11uM          -0.0268         0.0000       0.0                9921.0             ABT-239             1.11uM  HA1E       24h
LJP002_MCF10A_24H_J21_Y-39983_0.37uM          -0.0229         0.0000       0.0                15720.0            Y-39983             0.37uM  MCF10A     24h
REP.A013_JURKAT_24H_C06_triamcinolone_0.04uM  -0.0225         0.0000       0.0                14132.5            triamcinolone       0.04uM  JURKAT     24h
REP.A013_JURKAT_24H_D16_paroxetine_0.37uM     -0.0270         0.0000       0.0                12251.5            paroxetine          0.37uM  JURKAT     24h
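
The Log2(Fold Change) column is the base-2 logarithm of the Fold Change column. In the tables above, rows with a fold change of inf or 0.0000 show a log2 value of 0, which suggests that non-finite results are replaced by 0. The sketch below reproduces that behaviour under this assumption; it is inferred from the output rather than taken from the appyter's code, and `log2_fold_change` is a hypothetical helper.

```python
import math


def log2_fold_change(fold_change):
    """Return log2 of a fold change, mapping undefined cases to 0.0.

    Assumption: infinite, NaN, zero, and negative fold changes are reported
    as 0.0, as the tables above appear to do.
    """
    if not math.isfinite(fold_change) or fold_change <= 0:
        return 0.0
    return math.log2(fold_change)


# Fold changes taken from the tables above.
examples = [694.3725, 14.1684, float("inf"), 0.0]
log2_values = [log2_fold_change(fc) for fc in examples]
```

Because a value of 0 is then indistinguishable from "no change", rows with an infinite or zero fold change are best read from the Fold Change column itself.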

International Mouse Phenotyping Consortium (IMPC)

https://www.mousephenotype.org/

IMPC serves mouse phenotype information associated with gene markers. Its API allows us to identify phenotypes significantly associated with a gene.

[Figure: Phenotype known to be associated with KRT2 from IMPC. x-axis: -logp(combined_stouffer_statistic), ticks 0 to 4; y-axis: mp_term_name; bar shown: decreased grip strength]
/usr/local/lib/python3.8/dist-packages/pandas/core/arraylike.py:397: RuntimeWarning: divide by zero encountered in log10
  result = getattr(ufunc, method)(*inputs, **kwargs)
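
The x-axis label refers to a combined Stouffer statistic, and the warning above indicates that at least one p-value reached exactly 0 before the log10 transform. The sketch below shows one common way to combine p-values with Stouffer's unweighted Z method and to take -log10 without dividing by zero. It is an illustration under stated assumptions: how the appyter actually computes the statistic is not shown in this report, and the function names and the `floor` parameter are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def stouffer_combine(p_values, eps=1e-300):
    """Combine one-sided p-values with Stouffer's unweighted Z method.

    Returns (combined_z, combined_p). P-values are clipped to (eps, 1 - 1e-12)
    so that the inverse normal CDF is always defined.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    clipped = [min(max(p, eps), 1 - 1e-12) for p in p_values]
    # z_i = Phi^{-1}(1 - p_i), computed as -Phi^{-1}(p_i) for small-p accuracy.
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in clipped]
    combined_z = sum(z_scores) / math.sqrt(len(z_scores))
    # Upper-tail probability, computed as Phi(-Z) rather than 1 - Phi(Z).
    combined_p = _STD_NORMAL.cdf(-combined_z)
    return combined_z, combined_p


def neg_log10_p(p, floor=1e-300):
    """Return -log10(p), flooring p so that p == 0 does not raise or give inf."""
    return -math.log10(max(p, floor))


combined_z, combined_p = stouffer_combine([0.05, 0.05])
score = neg_log10_p(combined_p)
```

Flooring the p-value caps the plotted bar at a finite height (300 with the default floor) instead of producing an infinite value.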

GlyGen

https://www.glygen.org/

GlyGen collects extensive protein product information related to glycans and provides access to that information through its API.

/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py:1043: InsecureRequestWarning: Unverified HTTPS request is being made to host 'api.glygen.org'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings

No information for gene with identifier KRT2 found in GlyGen

exRNA

https://ldh.clinicalgenome.org/ldh/ui/

The exRNA Linked Data Hub (LDH) facilitates efficient access to collated information such as links and select data from different data sources, which are made available using RESTful APIs. Currently, LDH focuses on linking information about human genes and variants to support exRNA curation efforts.

We provide the gene symbol to exRNA and obtain the reported linked data. The query produces a document with all associated regulatory elements that lie within ±10 kb of the gene or overlap it.

{
    "data": {11 items},
    "metadata": {1 item},
    "status": {2 items}
}
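
The ±10 kb criterion described above amounts to an interval-overlap test against the gene's coordinates extended by a flank on each side. A minimal sketch follows; the function name and the coordinates in the example are hypothetical and are not taken from the LDH response.

```python
def overlaps_gene_window(gene_start, gene_end, elem_start, elem_end, flank=10_000):
    """Return True if an element overlaps the gene extended by `flank` bp.

    All coordinates are assumed to be on the same chromosome, with inclusive
    start and end positions and start <= end.
    """
    window_start = max(0, gene_start - flank)
    window_end = gene_end + flank
    return elem_start <= window_end and elem_end >= window_start


# Hypothetical coordinates for illustration only.
gene = (100_000, 110_000)
upstream_element = (95_000, 96_000)   # inside the 10 kb flank
distant_element = (80_000, 85_000)    # more than 10 kb away
upstream_hit = overlaps_gene_window(*gene, *upstream_element)
distant_hit = overlaps_gene_window(*gene, *distant_element)
```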

HuBMAP

https://hubmapconsortium.org/

The goal of the Human BioMolecular Atlas Program (HuBMAP) is to develop an open and global platform to map healthy cells in the human body.

The HuBMAP ASCT+B Data was processed and is served by Enrichr. This data can be used to associate genes with cell types.

No information for gene with identifier KRT2 found in HuBMAP ASCT+B

Metabolomics

https://metabolomicsworkbench.org/

The National Institutes of Health (NIH) Common Fund Metabolomics Program was developed with the goal of increasing national capacity in metabolomics by supporting the development of next generation technologies, providing training and mentoring opportunities, increasing the inventory and availability of high quality reference standards, and promoting data sharing and collaboration.

MetGENE identifies the pathways and reactions catalyzed by the given gene KRT2, its related metabolites and the studies in Metabolomics Workbench with data on such metabolites.


Secondary Information

Each DCC has assembled a large repository of knowledge besides the data directly collected by the data generation centers they coordinate. We can access this expanded knowledge as well.

IDG

https://druggablegenome.net/

Pharos

We query IDG's knowledge base of targets and their disease associations through the Pharos API.

[Figure: Disease known to be associated with KRT2 from IDG's Pharos. x-axis: -logp(combined_stouffer_statistic), ticks 0 to 25; y-axis: name. Bar labels, as listed: Bullous ichthyosiform erythroderma; Contact hypersensitivity; ovarian cancer; ichthyosis bullosa of Siemens; ichthyosis; contact dermatitis; Peeling Skin Syndrome; PEELING SKIN SYNDROME; Ovarian cancer; Ovarian Cancer; Ichthyosis bullosa of Siemens; Ichthyosis Bullosa of Siemens; Ichthyosis; Ichthyoses; ICHTHYOSIS EXFOLIATIVA; Hyperkeratosis, Epidermolytic; Epidermolytic hyperkeratosis; Dermatitis, Contact; peeling skin syndrome; cutaneous lupus erythematosus; medulloblastoma, large-cell]

Harmonizome

We query the Harmonizome API for associations with various biological entities in a standardized set of numerous omics datasets, as detailed here.

{
    "symbol": "KRT2",
    "synonyms": [5 items],
    "name": "keratin 2, type II",
    "description": "The protein encoded by this gene is a member of the keratin gene family. The type II cytokeratins consist of basic or neutral proteins which are arranged in pairs of heterotypic keratin chains coexpressed during differentiation of simple and stratified epithelial tissues. This type II cytokeratin is expressed largely in the upper spinous layer of epidermal keratinocytes and mutations in this gene have been associated with bullous congenital ichthyosiform erythroderma. The type II cytokeratins are clustered in a region of chromosome 12q12-q13. [provided by RefSeq, Jul 2008]",
    "ncbiEntrezGeneId": 3849,
    "ncbiEntrezGeneUrl": "http://www.ncbi.nlm.nih.gov/gene/3849",
    "proteins": [1 item],
    "hgncRootFamilies": [1 item],
    "associations": [4300 items]
}
[Figure: Significant associations with KRT2 in IDG's Harmonizome. x-axis: absoluteZscore, ticks 0 to 14; y-axis: name; panels: direction=down and direction=up; legend: direction (up, down).]
First group of labels: BRD-K55479099/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; BRD-K25907192/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; BRD-K87275815/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; BRD-K85275903/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; BRD-K91654198/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; BRD-K69659917/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; BRD-K47962810/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; SBC-1/Sanger Dependency Map Cancer Cell Line Proteomics; BRD-K63856509/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; OV-90/Sanger Dependency Map Cancer Cell Line Proteomics
Second group of labels: BRD-K95591629/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; ASPC1/DepMap CRISPR Gene Dependency; CCC5/DepMap CRISPR Gene Dependency; BRD-K87176777/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; C32/DepMap CRISPR Gene Dependency; COV504/DepMap CRISPR Gene Dependency; BRD-K21662756/LINCS L1000 CMAP Chemical Perturbation Consensus Signatures; MEL270/DepMap CRISPR Gene Dependency; Brain - Cerebellar Hemisphere/GTEx Tissue Gene Expression Profiles 2023; HG3/DepMap CRISPR Gene Dependency
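
The figure ranks associations by absolute z-score and splits them by direction. A rough sketch of that post-processing is given below. The field names `geneSet`, `name`, and `standardizedValue` are assumptions about the shape of each entry in `associations`, which is collapsed in the output above, and `top_associations` is a hypothetical helper rather than the appyter's own function.

```python
def top_associations(associations, n=10):
    """Rank associations by absolute standardized value and label direction."""
    scored = []
    for assoc in associations:
        z = assoc.get("standardizedValue")
        if z is None:
            continue  # skip entries without a standardized score
        scored.append({
            "name": assoc["geneSet"]["name"],
            "direction": "up" if z > 0 else "down",
            "absoluteZscore": abs(z),
        })
    scored.sort(key=lambda row: row["absoluteZscore"], reverse=True)
    return scored[:n]


# Made-up entries for illustration; names and values are not real data.
sample_associations = [
    {"geneSet": {"name": "set A"}, "standardizedValue": 1.5},
    {"geneSet": {"name": "set B"}, "standardizedValue": -3.0},
    {"geneSet": {"name": "set C"}},
]
ranked = top_associations(sample_associations, n=2)
```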

ARCHS4

https://maayanlab.cloud/archs4/

ARCHS4 has processed numerous GEO studies and also provides tissue expression data.

UniProt

https://www.uniprot.org/

UniProt is a comprehensive database on protein function information. Their Proteins REST API, documented here, can be used for gene-centric queries.

https://www.ebi.ac.uk/proteins/api/genecentric?offset=0&size=100&gene=KRT2
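
A gene-centric request of this form can be built for any gene symbol. The sketch below constructs the URL from the parameters visible in the link above (`offset`, `size`, `gene`) and shows one way to fetch it. The helper names are hypothetical, and requesting JSON through the `Accept` header is an assumption about how the response format is selected.

```python
import json
import urllib.parse
import urllib.request

PROTEINS_API = "https://www.ebi.ac.uk/proteins/api/genecentric"


def genecentric_url(gene, offset=0, size=100):
    """Build a Proteins API gene-centric query URL for a gene symbol."""
    params = {"offset": offset, "size": size, "gene": gene}
    return PROTEINS_API + "?" + urllib.parse.urlencode(params)


def fetch_genecentric(gene):
    """Fetch gene-centric entries over the network (assumes a JSON response)."""
    request = urllib.request.Request(
        genecentric_url(gene),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


url = genecentric_url("KRT2")
```

Passing the gene symbol resolved earlier in the report, rather than a hard-coded value, keeps this query consistent with the other sections.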