Schemas are defined in code, not SQL. ORM models are generated dynamically. MySQL tables are created and seeded on demand. Docker/CI builds can bootstrap databases automatically on startup. Queries automatically adapt to MySQL vs. SQLite and dataset-specific differences. #288
Closed
rmobmina wants to merge 14 commits into BioAnalyticResource:dev from
asherpasha (Collaborator) reviewed on Dec 4, 2025
api/__init__.py.save (Outdated)
Why do we have this file?
asherpasha (Collaborator) reviewed on Dec 4, 2025
vendor/flask_sqlacodegen/LICENSE (Outdated)
Do we need this whole vendor directory?
Collaborator
GitHub Actions need to pass.
Contributor
@asherpasha Reena updated the GitHub workflow YAML files from v2 to v3; the changes are minor. Can you take a look and approve if it's OK?
- Replace manual REGEX with BARUtils validators for all species
- Add descriptive error messages for each species (e.g., "Invalid Cannabis gene ID")
- Add comprehensive Sphinx/reST docstrings to all ORM files
- Improve validation logic to handle both AGI and probeset IDs correctly
- Fix SQL injection vulnerability by validating schema identifiers
- Add identifier validation (alphanumeric + underscore only) before SQL construction

Security: Addresses CodeQL high-severity SQL injection alert by validating all database/table/column identifiers match safe pattern before use in queries
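For reviewers, the identifier check described in this commit could look roughly like the sketch below. It is an illustration only: `validate_identifier` and `build_select` are hypothetical names, not functions confirmed to exist in this PR, and the column `data_probeset_id` is assumed. Identifiers are checked against an alphanumeric-plus-underscore pattern before they are interpolated, while values still go through driver placeholders.

```python
import re

# Alphanumeric characters and underscores only, as the commit message states.
_SAFE_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def validate_identifier(name):
    """Return name unchanged if it is a safe SQL identifier, else raise ValueError."""
    # fullmatch avoids the trailing-newline loophole of a "$" anchor.
    if not isinstance(name, str) or not _SAFE_IDENTIFIER.fullmatch(name):
        raise ValueError(f"Unsafe SQL identifier: {name!r}")
    return name


def build_select(database, table, column):
    """Hypothetical helper: build a SELECT using validated identifiers only."""
    db, tbl, col = (validate_identifier(part) for part in (database, table, column))
    # The gene ID value is NOT interpolated; it stays a driver placeholder.
    return f"SELECT `{col}` FROM `{db}`.`{tbl}` WHERE data_probeset_id = %s"
```

Anything that fails the pattern is rejected before a query string is ever built, which is the property the CodeQL alert is concerned with.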
- Fix trailing whitespace and blank line issues in efp_proxy.py
- Fix missing blank lines in microarray_gene_expression.py
- Add noqa comments for E402 in test_efp_data.py (imports after sys.path modification)
- Remove unused variable retry_url
- Fix spacing after commas in long list
- Fix indentation issues
api/models/efp_schemas.py
Defines every “simple” eFP schema once in Python: columns, types, defaults, indexes, seed rows, metadata.
This is the single source of truth.
Add or edit schema entries here to support new datasets. No SQL files required.
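To make the idea concrete, here is a minimal sketch of what one schema entry might look like. The registry name `SIMPLE_EFP_SCHEMAS`, the dictionary keys, the column names, and the seed values are all assumptions for illustration; the actual layout in `efp_schemas.py` may differ.

```python
from typing import Any, Dict

# Hypothetical registry keyed by bind key; every name and value is illustrative.
SIMPLE_EFP_SCHEMAS: Dict[str, Dict[str, Any]] = {
    "cannabis": {
        "table_name": "sample_data",
        "columns": [
            {"name": "data_probeset_id", "type": "string", "length": 24, "primary_key": True},
            {"name": "data_signal", "type": "float", "default": 0.0},
            {"name": "data_bot_id", "type": "string", "length": 16, "primary_key": True},
        ],
        "indexes": [("data_probeset_id", "data_bot_id")],
        "seed_rows": [
            # Made-up seed row, not real expression data.
            {"data_probeset_id": "GENE_0001", "data_signal": 1.5, "data_bot_id": "SAMPLE_A"},
        ],
        "metadata": {"species": "cannabis"},
    },
}


def validate_schema(bind_key: str, schema: Dict[str, Any]) -> None:
    """Raise ValueError if indexes or seed rows mention undeclared columns."""
    declared = {col["name"] for col in schema["columns"]}
    for index in schema.get("indexes", []):
        unknown = set(index) - declared
        if unknown:
            raise ValueError(f"{bind_key}: index uses unknown columns {sorted(unknown)}")
    for row in schema.get("seed_rows", []):
        unknown = set(row) - declared
        if unknown:
            raise ValueError(f"{bind_key}: seed row uses unknown columns {sorted(unknown)}")


# Check every entry once at import time so mistakes surface early.
for key, schema in SIMPLE_EFP_SCHEMAS.items():
    validate_schema(key, schema)
```

A small consistency check like `validate_schema` is one way to keep a hand-edited registry trustworthy as new datasets are added.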
api/models/efp_dynamic.py
Generates SQLAlchemy ORM models dynamically at import time from the schema definitions.
Each bind key (e.g., cannabis, dna_damage) becomes a mapped model class.
The app keeps full ORM functionality without writing model files to disk.
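As a rough illustration of dynamic model generation, the sketch below builds mapped classes with `type()` on a plain SQLAlchemy declarative base. It is not the PR's code: the `SCHEMAS` dictionary, `build_model`, the column names, and the class-naming rule are hypothetical. The table name is prefixed with the bind key only because this sketch uses one shared metadata; `__bind_key__` has an effect only under Flask-SQLAlchemy.

```python
from sqlalchemy import Column, Float, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical, trimmed-down schema definitions keyed by bind key.
SCHEMAS = {
    "cannabis": {"probeset_len": 24, "bot_len": 16},
    "dna_damage": {"probeset_len": 10, "bot_len": 32},
}


def build_model(bind_key, spec):
    """Create a mapped class for one bind key without writing a model file."""
    attrs = {
        "__tablename__": f"{bind_key}_sample_data",
        "__bind_key__": bind_key,  # ignored by plain SQLAlchemy
        "data_probeset_id": Column(String(spec["probeset_len"]), primary_key=True),
        "data_bot_id": Column(String(spec["bot_len"]), primary_key=True),
        "data_signal": Column(Float, nullable=False, default=0.0),
    }
    class_name = "".join(part.title() for part in bind_key.split("_")) + "SampleData"
    return type(class_name, (Base,), attrs)


# Models are built once, at import time, from the in-memory definitions.
MODELS = {key: build_model(key, spec) for key, spec in SCHEMAS.items()}

# Usage against an in-memory SQLite database (stand-in for the real binds).
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Cannabis = MODELS["cannabis"]
with Session(engine) as session:
    session.add(Cannabis(data_probeset_id="GENE_0001", data_bot_id="SAMPLE_A", data_signal=1.5))
    session.commit()
    row = session.query(Cannabis).filter_by(data_probeset_id="GENE_0001").one()
    signal = row.data_signal
```

Because the classes are ordinary mapped classes, the usual session and query APIs work on them unchanged.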
api/services/efp_bootstrap.py
Builds real MySQL databases and tables using SQLAlchemy DDL helpers.
Creates databases.
Creates tables.
Optionally inserts seed rows.
All structure is derived from the schema definitions in memory—no static .sql dumps.
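A hedged sketch of the table-creation step follows. It renders MySQL DDL from an in-memory definition using SQLAlchemy's `CreateTable` construct compiled against the MySQL dialect, so no database connection is needed to try it. The schema dictionary shape, `TYPE_MAP`, and the function names are assumptions; database creation and seeding are left out.

```python
from sqlalchemy import Column, Float, Integer, MetaData, String, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.schema import CreateTable

# Hypothetical mapping from schema type names to SQLAlchemy types.
TYPE_MAP = {"string": String, "integer": Integer, "float": Float}

# Hypothetical in-memory schema definition (illustrative names only).
CANNABIS_SCHEMA = {
    "table_name": "sample_data",
    "columns": [
        {"name": "data_probeset_id", "type": "string", "length": 24, "primary_key": True},
        {"name": "data_signal", "type": "float"},
        {"name": "data_bot_id", "type": "string", "length": 16, "primary_key": True},
    ],
}


def table_from_schema(metadata, schema):
    """Build a SQLAlchemy Table object from a schema dictionary."""
    columns = []
    for col in schema["columns"]:
        sa_type = TYPE_MAP[col["type"]]
        type_ = sa_type(col["length"]) if "length" in col else sa_type()
        columns.append(
            Column(
                col["name"],
                type_,
                primary_key=col.get("primary_key", False),
                nullable=col.get("nullable", False),
            )
        )
    return Table(schema["table_name"], metadata, *columns)


def render_mysql_ddl(schema):
    """Return the CREATE TABLE statement as MySQL would receive it."""
    table = table_from_schema(MetaData(), schema)
    return str(CreateTable(table).compile(dialect=mysql.dialect()))


ddl = render_mysql_ddl(CANNABIS_SCHEMA)
print(ddl)
```

Against a live server, the same `Table` objects could be passed to `MetaData.create_all(engine)` instead of being rendered to strings.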
scripts/bootstrap_simple_efp_dbs.py
CLI wrapper that bootstraps the simple eFP databases.
Can target all or selected databases:
python scripts/bootstrap_simple_efp_dbs.py --databases cannabis dna_damage embryo
Useful for Docker init scripts or CI setup.
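The argument handling for such a wrapper might look like the sketch below. Only the `--databases` flag and the three database names come from the example command above; `KNOWN_DATABASES`, `build_parser`, `selected_databases`, and the "default to all" behaviour when the flag is omitted are assumptions.

```python
import argparse
from typing import List, Optional, Sequence

# Names taken from the example command; the real list is likely longer.
KNOWN_DATABASES = ["cannabis", "dna_damage", "embryo"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Bootstrap simple eFP databases (sketch).")
    parser.add_argument(
        "--databases",
        nargs="+",
        choices=KNOWN_DATABASES,
        default=None,
        metavar="NAME",
        help="Databases to bootstrap; omit to bootstrap all of them.",
    )
    return parser


def selected_databases(argv: Optional[Sequence[str]] = None) -> List[str]:
    """Return the databases to bootstrap, defaulting to every known one."""
    args = build_parser().parse_args(argv)
    return list(args.databases) if args.databases else list(KNOWN_DATABASES)
```

For example, `selected_databases(["--databases", "cannabis", "embryo"])` yields only those two names, while an empty argument list selects all known databases.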
config/init.sh
Runs the bootstrap script before loading any legacy dumps.
This ensures simple eFP databases are created and seeded during container startup.
api/services/efp_data.py
Unified query engine for both dynamic and legacy datasets.
Handles:
- bind key resolution
- MySQL vs SQLite fallback
- AGI → probeset mapping
- sample filtering
- schema-specific column differences
All eFP datasets now follow a single query path.
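The single query path can be pictured with the simplified sketch below, which uses the standard-library `sqlite3` module rather than the project's SQLAlchemy binds. The `DATASETS` flags, the `agi_lookup` table, the column names, and the function names are hypothetical. It covers dataset resolution, optional AGI-to-probeset mapping, and sample filtering, but not the MySQL vs. SQLite fallback.

```python
import sqlite3

# Hypothetical per-dataset settings; the real rules live in efp_data.py.
DATASETS = {
    "cannabis": {"needs_lookup": False},   # gene ID is used directly
    "dna_damage": {"needs_lookup": True},  # AGI must be mapped to a probeset
}


def resolve_probeset(conn, dataset, gene_id):
    """Map an AGI to a probeset ID when the dataset requires it."""
    if not DATASETS[dataset]["needs_lookup"]:
        return gene_id
    row = conn.execute(
        "SELECT probeset FROM agi_lookup WHERE agi = ?", (gene_id.upper(),)
    ).fetchone()
    return row[0] if row else None


def query_expression_sketch(conn, dataset, gene_id, samples=None):
    """Return (sample, signal) rows for one gene, optionally filtered by sample."""
    if dataset not in DATASETS:
        raise ValueError(f"Unknown dataset: {dataset!r}")
    probeset = resolve_probeset(conn, dataset, gene_id)
    if probeset is None:
        return []
    sql = "SELECT data_bot_id, data_signal FROM sample_data WHERE data_probeset_id = ?"
    params = [probeset]
    if samples:
        # One placeholder per sample keeps the values parameterised.
        placeholders = ", ".join("?" for _ in samples)
        sql += f" AND data_bot_id IN ({placeholders})"
        params.extend(samples)
    sql += " ORDER BY data_bot_id"
    return conn.execute(sql, params).fetchall()
```

Keeping dataset-specific behaviour in a small settings table, rather than in per-dataset functions, is what lets every dataset share one code path.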
api/resources/efp_proxy.py
Public endpoints exposing the dynamic system:
GET /efp_proxy/expression → uses query_efp_database_dynamic
POST /efp_proxy/bootstrap/simple → triggers the bootstrap process over HTTP