Basic Information

Symbol
SEMG1
RNA class
mRNA
Alias
Semenogelin 1; Cancer/Testis Antigen 103; Semenogelin I; CT103; SEMG; Semen Coagulating Protein; Semenogelin-1; SGI; DJ172H20.2; SgI-29
Location (GRCh38)
Forensic tag(s)
Tissue/body fluid identification

MANE select

Transcript ID
NM_003007.5
Sequence length
1622 nt
GC content
0.4118
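
The GC content reported here is the fraction of G and C bases in the transcript. As a minimal sketch of that calculation, assuming the value is a simple count ratio over the full sequence: the function name `gc_content` and the toy sequence are illustrative only and are not part of the database record.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


# Toy input (hypothetical, not the full SEMG1 transcript).
toy = "AGACAAGGUU"
print(round(gc_content(toy), 4))

# A reported value of 0.4118 over 1622 nt corresponds to roughly this many G/C bases.
print(round(0.4118 * 1622))
```

Applied to the full 1622 nt sequence, the same function should reproduce the listed value, provided the record uses this definition.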

Secondary Structure

Generated by RNAfold
Minimum free energy (MFE) structure:
The single secondary structure with the lowest predicted free energy.
Ensemble properties:
Thermodynamic properties of the Boltzmann ensemble.
Minimum free energy
-362.1 kcal/mol
Thermodynamic ensemble
Free energy: -392.89 kcal/mol
Frequency: 0.0000
Diversity: 445.26
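
The frequency of 0.0000 follows from the two energies above. In a Boltzmann ensemble, the probability of a single structure with energy E is exp(-(E - G)/RT), where G is the ensemble free energy. The sketch below reproduces that arithmetic, assuming a folding temperature of 37 °C (310.15 K) and a gas constant of 1.98717e-3 kcal/(mol·K); both values are assumptions, since the record does not state the RNAfold parameters used, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 1.98717e-3  # kcal/(mol*K); assumed value
TEMPERATURE_K = 310.15     # 37 degrees C; assumed folding temperature


def mfe_frequency(mfe: float, ensemble_energy: float,
                  temperature_k: float = TEMPERATURE_K) -> float:
    """Boltzmann probability of the MFE structure within the ensemble."""
    rt = GAS_CONSTANT * temperature_k
    # ensemble_energy <= mfe, so the exponent is never positive.
    return math.exp((ensemble_energy - mfe) / rt)


# Energies taken from the record above (kcal/mol).
freq = mfe_frequency(mfe=-362.1, ensemble_energy=-392.89)
print(f"{freq:.4f}")  # prints 0.0000
print(f"{freq:.2e}")  # shows the order of magnitude
```

A gap of about 30.8 kcal/mol between the two energies makes the MFE structure a vanishingly small share of the ensemble, which is typical for a transcript of this length and is consistent with the high ensemble diversity.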
Structure Prediction
MFE Structure Prediction
......(.(((((((((.(((((((.(.....).))))))).....................))))))))).)((((((..(((((((((....((((((..(((.(((((((((((((..(((((........((((((((..(((....(((((........(((.((((.....((((......((((((..(....)..)))))).(((((...(((....)))...)))))))))..((........)))))).))).((((((......))))))....)))))......((........)).........)))..))))))))......(((((................))))).((((.(((....)))...))))...)))))..)))))))..........(((((((((.((((((((.((((.........((((.((.((((..(((......))))))).))))))((((((.......))))))......((((((((.(((.((((((((((.....(((.(((......)))..((((.((((((.(..................((..........))(((((((((((.((...)).)))))..)))))).............((((((........))))))........((((((..((((.....))))......(((.((((.....................)))).)))))))))).))))))))))..)))....)))))))))).)).).)))))))).)))).)))))))).(((((.............................((((......))))..........(((....)))...................(((((..((((.(((((.......(((((...)))))....(((.(((((......))))).)))...((((............))))..((((..((.....))...)))).......(((...........)))....((....))...(((....)))...................)))))..)))).))))).......(((((...)))))....(((.(((((......))))).)))...((((............))))....)))))..(((((...(((....(((((((.((((((((..((((.(((.(((((...(((((....(((......((.((((......((((....)))).....))))))..(((....)))........)))..))))).))))).)))((((((((.(((.((...............((((((.((.(((((((..((((.(((((((....((........)).......(((((.(((((((.....((((....))))...)))))))..)))))............))))))).)))).))).......)))).)).))))))..)).))).))))))))..))))..)))..))))))).))))))))....)))))...)))))))))......)))))).......)))..)))))).....))))).........))))...))))))..
Thermodynamic Ensemble Prediction
......({{,..||,{(((((((((.{.....}.))))))}}}}........{{{{.{{{...,,}))),)}}|((({(.{((({((((({.....((({{{((,...,{,,|||||||..(((,,....|{|.,{{(((,,..|||....,|||{...,,...,{{,||{{.....{{{,......{(((((..,....,,,)))))}.{{{{{...,,,....,}|..,)))}}}))}..}},}},,,,.,.)))),}},.((((((......))))))....}}}}}......,.........}}.....|,..|||..}|))}})},.....{{|||.............,,,|||}}.((((.(((....)))...))))...,,,,,.{||||},,.}}}.||...{{((((({,.((((((((.((((.........((((.((.((((..(((......)))))))})}))))((((((.......))))))......((((((((.{((.((((((((((.....{((..,{......}))..((((.((((((.(.....{{,,,....,..,({.....,,...}}{(((({(((((,{,...,).}))))..}})))).............((((((........))))))........{(((({..((({.....)))).,.,{{(((.{,{{.,,...........,,.,...})))})))))))))).))))))))))..,,,....)))))))))).)).).)))))))).)))).)))))))).||.,,......|,{{,,{{.....,,........((((......))))........}}(((....)))...................(((((.......))))}.......(((({...}))))....{((.(((((......))))).)))...({{{............}}))..({((..(({{.{{|,.,{{{{|,....,,{{..,,,..,,.,.||{....((....))...(((....)))...........,,{,,{,.(((((,,||..,||}|,.......|||||,,.))))),...|||}|||{{......|||}}.}}}..{{{{{,{,,.,.{{,..||||....||}}}}.{{,||,..((({....,,{|||}{(({{{((({..{{.{({,(((((...(((((....(((....,{{(.((((,.....((((....))))...,.})))},.}||,,,..}}........,)))..))))}.))))).}))|,|||||}}}||}||,||||.........{(,{{{,{(({(,,..,.}}|||,,||{||,,..,,||,,......|,,......(((((.(((((((.....((((....))))...)))))))..))))),{{........}))}}}}},})}},}|||,}}.,.)))},})}))}})),,|}},)),))))))))).,)))).)))}})))}}}}.}||,}},,.,,,}}))),..))))))))),..,{{{(||,,,,,},,,})}||})))))}....,},,.....|||{|,,..,,}))))))..
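
The MFE prediction above is written in dot-bracket notation: `.` marks an unpaired base and each `(` pairs with its matching `)`. A minimal sketch of reading such a string into a table of paired positions is shown below; the function name and the toy structure are illustrative only. The thermodynamic ensemble string uses additional symbols (`,`, `|`, `{`, `}`) that summarise pairing probabilities rather than one structure, so this parser deliberately rejects it.

```python
def parse_dot_bracket(structure: str) -> dict[int, int]:
    """Map each paired position (0-based) to its partner position."""
    stack = []   # indices of '(' still waiting for a partner
    pairs = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


# Toy structure (hypothetical, not the SEMG1 prediction).
toy = "((..((...))..))"
pairs = parse_dot_bracket(toy)
print(len(pairs) // 2)  # number of base pairs: 4
print(pairs[0])         # partner of the first base: 14
```

Running the same function on the full MFE string would give the number of predicted base pairs and confirm that the brackets are balanced.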

Transcripts

ID Sequence Length GC content
AGACAAGGUUUUCCAAGCAAGAUGAAGCCCAACAUCAUCUUUGUACUUU… 1622 nt 0.4118
Summary

The protein encoded by this gene is the predominant protein in semen. The encoded secreted protein is involved in the formation of a gel matrix that encases ejaculated spermatozoa. This preproprotein is proteolytically processed by the prostate-specific antigen (PSA) protease to generate multiple peptide products that exhibit distinct functions. One of these peptides, SgI-29, is an antimicrobial peptide with antibacterial activity. This proteolysis process also breaks down the gel matrix and allows the spermatozoa to move more freely. This gene and another similar semenogelin gene are present in a gene cluster on chromosome 20. [provided by RefSeq, Feb 2016]

Forensic Context

A study of human samples showed that SEMG1 mRNA is a specific marker for semen identification: a SHERLOCK-based detection system yielded significantly higher relative fluorescence unit (RFU) values in 53 semen samples than in 40 non-semen samples and negative controls (P < 0.0001) [Luo et al. DOI:10.1016/j.fsigen.2025.103410]. The rapid extraction and detection method, which targets the SEMG1 gene, detected as little as 0.25 µL of semen and reduced total analysis time to 70 minutes, providing a portable option for on-site forensic body fluid identification. A literature review of proteomics in forensic science spanning 2004–2024 identifies SEMG1 as a protein marker commonly found in seminal fluid for body fluid identification [Alex et al. DOI:10.1016/j.scijus.2025.101320].