abYbank / EMBLIG

Antibody data resources

EMBLIG

The EMBLIG data set is a compilation of entries selected from the EMBL database.

Latest File:

File                                EMBL Release  Date        Link
emblig_latest.tar.gz                20230625      22.01.2024  Download

EMBLIG Files:

File                                EMBL Release  Date        Link
emblig_20240122_r20230625.tar.gz    20230625      22.01.2024  Download
emblig_20220723_r20220531.tar.gz    20220531      23.07.2022  Download
emblig_20211101_r11Oct2021.tar.gz   11Oct2021     01.11.2021  Download
emblig_20210331_r143.tar.gz         143           31.03.2021  Download
emblig_20200211_r142.tar.gz         142           11.02.2020  Download
emblig_20190405_r138.tar.gz         138           05.04.2019  Download
emblig_20180710_r136.tar.gz         136           10.07.2018  Download
emblig_20180118_r134.tar.gz         134           18.01.2018  Download
emblig_20171017_r133.tar.gz         133           17.10.2017  Download
emblig_20161013_r129.tar.gz         129           13.10.2016  Download
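
The archive names listed above appear to follow the pattern emblig_<YYYYMMDD>_r<release>.tar.gz, where the first field is the build date and the second is the EMBL release label. This pattern is inferred from the listing and is not documented on this page. Under that assumption, a minimal Python sketch for splitting a name into its two fields could look like the following. The function name parse_archive_name is hypothetical.

```python
import re
from datetime import date, datetime

# Pattern inferred from the listed file names (an assumption, not a documented spec):
#   emblig_<YYYYMMDD>_r<release>.tar.gz
ARCHIVE_RE = re.compile(r"^emblig_(\d{8})_r(.+)\.tar\.gz$")


def parse_archive_name(name):
    """Return (build date, EMBL release label) for a dated EMBLIG archive name."""
    m = ARCHIVE_RE.match(name)
    if m is None:
        raise ValueError(f"not a dated EMBLIG archive name: {name!r}")
    build = datetime.strptime(m.group(1), "%Y%m%d").date()
    # The release label is kept as a string because it is not always numeric
    # (for example "143", "11Oct2021", "20230625").
    return build, m.group(2)
```

The undated emblig_latest.tar.gz does not fit the pattern and is rejected with a ValueError, so it would need to be handled separately.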

Information

The aim is to capture every sequence in the public domain that contains an antibody light-chain or heavy-chain variable domain, using a simple similarity protocol.

To avoid experimentally unverified sequences, pseudogenes, etc., the selection is restricted to certain data classes and taxonomic divisions, and each entry must have a protein translation.

Every protein translation downloaded from the EMBL release directory is used as a query sequence in a conservative BLAST search against three reference sets of known Ig variable domain sequences.

The reference sets are:

The overwhelming majority (95%) of queries that match one set match all three.

Entries with a translation that gets a match are collected and the protein identifiers of the translations are recorded.
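
The collection step described above might be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline actually used to build EMBLIG. It assumes that each BLAST search was saved in the standard 12-column tabular format (-outfmt 6), that a match in any one reference set is enough for an entry to be collected, and that "conservative" means a strict E-value cutoff. The cutoff value and the function names are hypothetical.

```python
# Hypothetical cutoff: the page says only that the search is "conservative".
MAX_EVALUE = 1e-10


def matched_queries(tabular_text, max_evalue=MAX_EVALUE):
    """Return the query IDs with at least one hit at or below max_evalue.

    Expects BLAST tabular output (-outfmt 6): tab-separated fields with the
    query ID in column 1 and the E-value in column 11.
    """
    hits = set()
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        if line.startswith("#") or len(fields) < 12:
            continue  # skip comment lines and malformed rows
        if float(fields[10]) <= max_evalue:
            hits.add(fields[0])
    return hits


def collect(per_set_outputs):
    """Combine hits across the reference sets.

    per_set_outputs maps a reference-set label to its BLAST tabular text.
    Returns (IDs matching any set, fraction of those IDs matching every set).
    """
    per_set = [matched_queries(text) for text in per_set_outputs.values()]
    any_set = set().union(*per_set)
    all_sets = set.intersection(*per_set) if per_set else set()
    fraction = len(all_sets) / len(any_set) if any_set else 0.0
    return any_set, fraction
```

The second return value corresponds to the kind of overlap figure quoted above: the share of matched queries that hit every reference set rather than only some of them.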