GeoGenIE (Geographic-Genetic Inference Engine) is a deep learning tool that predicts geographic localities (latitude and longitude) from genetic SNP data. Built on PyTorch, GeoGenIE is designed to account for geographic sampling bias and for the limited number of SNPs typically found in GT-seq panels. It gives researchers who need to infer geographic origins from genetic data a robust framework for handling complex datasets.
GeoGenIE achieves higher prediction accuracy than existing software and also provides a range of metrics and visualizations for thoroughly assessing model performance. It is designed to be usable by researchers with limited experience in deep learning. Bootstrapping can optionally be parallelized across multiple CPU cores to speed up analysis. Users can adjust model parameters, visualization options, and preprocessing settings while adhering to standard best practices. GeoGenIE relies on vectorized Python libraries such as NumPy, SciPy, scikit-learn, and pandas for fast and reliable processing.
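A common way to score coordinate predictions is the great-circle (haversine) distance between each predicted and true locality. The sketch below illustrates that idea using only the Python standard library. It is not GeoGenIE's code or API: the function names `haversine_km` and `summarize_errors`, the `(latitude, longitude)` tuple layout, and the choice of summary statistics are all assumptions made for illustration.

```python
import math
from statistics import mean, median

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def summarize_errors(true_coords, pred_coords):
    """Summarize per-sample prediction error (hypothetical helper).

    Both arguments are equal-length sequences of (latitude, longitude) tuples.
    """
    errors = [
        haversine_km(t_lat, t_lon, p_lat, p_lon)
        for (t_lat, t_lon), (p_lat, p_lon) in zip(true_coords, pred_coords)
    ]
    return {"mean_km": mean(errors), "median_km": median(errors), "max_km": max(errors)}
```

Reporting the median alongside the mean is useful here because a few badly misplaced samples can dominate the mean error.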
SNPio is a software project that aims to provide a user-friendly, object-oriented Python API for reading, filtering, and converting standard genomic data file formats. Genomics researchers often have to work with diverse data formats and need efficient tools for manipulating them. SNPio addresses these challenges with a streamlined interface that simplifies the management of genomic datasets, so users can focus on analysis and interpretation.
PG-SUI (Population Genomics - Supervised and Unsupervised Imputation) is an AI-based software framework for imputing missing data in genomic SNP datasets. Missing data is common in population genomics and can lead to biased results and reduced statistical power. PG-SUI uses both supervised and unsupervised machine learning models to predict missing SNP genotypes from non-linear patterns in the observed data. It is designed to improve the accuracy of genomic studies, reduce the need for extensive data collection, and integrate into bioinformatics pipelines, making it accessible to a broad range of researchers.
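To make the imputation problem concrete, the sketch below fills missing genotype calls with the most frequent observed call at each locus. This is a deliberately simple baseline, not PG-SUI's method or API, and it captures none of the non-linear structure that machine learning models target. The 0/1/2 genotype encoding, the `-9` missing-data code, and the function name `impute_by_locus_mode` are assumptions made for illustration.

```python
from collections import Counter

MISSING = -9  # assumed missing-data code for this sketch


def impute_by_locus_mode(genotypes):
    """Replace missing calls with the per-locus mode (hypothetical baseline).

    `genotypes` is a list of samples, each a list of calls encoded 0/1/2,
    with MISSING marking an absent call. Returns a new list of lists.
    """
    n_loci = len(genotypes[0])
    modes = []
    for j in range(n_loci):
        observed = [row[j] for row in genotypes if row[j] != MISSING]
        # Ties resolve to the call seen first; a locus with no observed
        # calls cannot be imputed this way and stays missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else MISSING)
    return [
        [modes[j] if call == MISSING else call for j, call in enumerate(row)]
        for row in genotypes
    ]
```

A baseline like this is mainly useful as a point of comparison: a model-based imputer should recover masked genotypes more accurately than the per-locus mode does.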
ClineHelpR is an R package for plotting and visualizing results from Bayesian Genomic Cline (BGC) analyses and from the INTROGRESS R package. Aimed at researchers working with genomic clines, it simplifies the detection of cline outliers and the correlation of environmental data with genomic clines and hybrid indices. Its functions produce clear, publication-ready visualizations that make the genomic and environmental relationships in hybrid zones easier to interpret.