Step 2 — Motif Search

Scans every promoter for transcription-factor binding motifs and computes a hypergeometric over-representation p-value per motif.

Inputs

  • Target FASTA — usually the promoters.fa from Step 1.

  • Motifs — any combination of:

    • Free-text IUPAC consensus (one per line, NAME SEQ)

    • MEME file

    • Live import from PlantTFDB (157 species), AnimalTFDB (vertebrates + insects), JASPAR 2024, or HOCOMOCO v11.
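As an illustration of the free-text input, the sketch below parses NAME SEQ lines and scans a promoter on both strands by translating the IUPAC consensus into a regular expression. It is a minimal sketch, not the Cis-GS scanner: the function names (parse_motifs, scan, and the helpers) are hypothetical, positions are reported 0-based, and the handling of palindromic motifs is an assumption.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table covering the ambiguity codes as well.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def parse_motifs(text):
    """Parse 'NAME SEQ' lines (hypothetical helper) into {name: consensus}."""
    motifs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, seq = line.split()[:2]
        motifs[name] = seq.upper()
    return motifs


def reverse_complement(consensus):
    return consensus.translate(COMPLEMENT)[::-1]


def iupac_to_regex(consensus):
    # A lookahead lets overlapping occurrences be reported too.
    body = "".join(IUPAC[c] for c in consensus)
    return re.compile(f"(?=({body}))")


def scan(promoter, consensus):
    """Return sorted (position, strand) hits, 0-based, on both strands."""
    promoter = promoter.upper()
    hits = {(m.start(), "+") for m in iupac_to_regex(consensus).finditer(promoter)}
    rc = reverse_complement(consensus)
    if rc != consensus:  # assumption: report palindromic motifs once, on "+"
        hits |= {(m.start(), "-") for m in iupac_to_regex(rc).finditer(promoter)}
    return sorted(hits)
```

For example, scan("AACACGTGTTGACC", "CACGTG") reports a single plus-strand hit, because CACGTG is its own reverse complement.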

Statistics

For each motif:

\[p = P(X \geq k) \;\text{where}\; X \sim \text{Hypergeom}(N, K, n)\]

with

  • \(N\): total number of promoters

  • \(K\): number of promoters in which the motif occurs at least once

  • \(n\): number of query promoters (e.g. a K-means cluster from Step 6)

  • \(k\): number of query promoters with a hit

Multiple-testing correction: Benjamini-Hochberg (cis_gs.enrichment.core.bh_fdr).
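The test above can be sketched in plain Python. This is an illustrative stand-in, not the code behind cis_gs.enrichment.core: hypergeom_sf and bh_adjust are hypothetical names, the tail probability is summed exactly with integer binomials, and the adjustment is the standard Benjamini-Hochberg step-up procedure.

```python
from math import comb


def hypergeom_sf(N, K, n, k):
    """P(X >= k) for X ~ Hypergeom(N, K, n), summed exactly with integers."""
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

As a small worked case, with N = 10 promoters, K = 4 carrying the motif, and a query of n = 3 that all carry it (k = 3), the p-value is C(4,3) / C(10,3) = 4/120, about 0.033.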

Outputs

  • hits.csv — one row per gene × motif, with hit position, strand, raw and adjusted p-value.

  • Significance Summary — collapsed table with one row per (gene × motif).

Gene-ID Resolution

Cis-GS adds three optional ID-mapping methods to bridge the common NCBI LOC### ↔ species-database mismatch:

  1. Column swap — append XM_ / XP_ accessions to the exported CSV.

  2. Mapping CSV — user-supplied two-column lookup.

  3. GFF3 Dbxref expansion — pull every synonym from Dbxref= and locus_tag= attributes in the annotation.
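The third method can be sketched as follows. This is a rough illustration under stated assumptions, not the Cis-GS implementation: synonyms_from_gff3 and parse_attributes are hypothetical names, only features of type gene are read, and synonyms are keyed on the GFF3 ID attribute.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column into a {key: value} dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def synonyms_from_gff3(lines):
    """Map each gene's ID attribute to synonyms from Dbxref= and locus_tag=."""
    table = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Assumption: only 'gene' features are used for ID resolution.
        if len(cols) != 9 or cols[2] != "gene":
            continue
        attrs = parse_attributes(cols[8])
        names = set()
        for xref in attrs.get("Dbxref", "").split(","):
            if xref:
                names.add(unquote(xref))                    # e.g. "GeneID:123"
                names.add(unquote(xref.split(":", 1)[-1]))  # e.g. "123"
        if "locus_tag" in attrs:
            names.add(unquote(attrs["locus_tag"]))
        if "ID" in attrs:
            table[attrs["ID"]] = names
    return table
```

Keeping both the tagged form (GeneID:123) and the bare accession (123) means a lookup can succeed whichever form the expression table uses.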


© Copyright 2026, Ayushman Mallick (Plant Signaling Lab, IISER Tirupati).
