Input indexes file(s)
Users must provide their available indexes as two-column, tab-delimited text file(s) without a header: index ids are in the first column and the corresponding sequences in the second. An example of such a file is available in the GitHub repository and also here to test the application. Note that for dual-indexing sequencing experiments, the first file corresponds to indexes 1 (i7) and the second file to indexes 2 (i5). Example files for dual-indexing are available here (index 1) and here (index 2).
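As an illustration of the expected file layout, here is a minimal Python sketch that parses such a file. The function name `read_indexes` and the validation rules (exactly two columns, unique ids, sequences made of A/C/G/T only) are assumptions made for this example; they are not taken from the checkMyIndex source code.

```python
import csv
import io


def read_indexes(handle):
    """Parse a header-less, two-column, tab-delimited index file.

    Returns a dict mapping index id -> sequence (upper-cased).
    The checks below are assumptions of this sketch, not documented
    checkMyIndex behaviour.
    """
    indexes = {}
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip empty lines
        if len(row) != 2:
            raise ValueError(f"line {line_number}: expected 2 columns, got {len(row)}")
        index_id, sequence = row[0].strip(), row[1].strip().upper()
        if not sequence or set(sequence) - set("ACGT"):
            raise ValueError(f"line {line_number}: invalid sequence {sequence!r}")
        if index_id in indexes:
            raise ValueError(f"line {line_number}: duplicated index id {index_id!r}")
        indexes[index_id] = sequence
    return indexes


# Hypothetical file content: ids in column 1, sequences in column 2.
example = "id1\tATCACG\nid2\tCGATGT\nid3\tTTAGGC\n"
indexes = read_indexes(io.StringIO(example))
```

With a real file, the same function can be called on `open(path, newline="")`; for dual-indexing it would simply be called once per file (i7, then i5).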
How the algorithm works
There can be many combinations of indexes to check, depending on the number of input indexes and the multiplexing rate. Testing the compatibility of all the combinations may therefore take a long time or even be impossible. The trick is to find a partial solution with the desired number of pools/lanes but with fewer samples than requested, and then to complete each pool/lane with some of the remaining indexes to reach the desired multiplexing rate. Indeed, adding indexes to a combination of compatible indexes always gives a compatible combination. In short, fewer samples per pool/lane means fewer combinations to test, which makes the search for a partial solution very fast. Adding some indexes to complete each pool/lane is fast too and gives the final solution.
Unfortunately, this trick reduces the number of index combinations explored, so the search for a final solution may fail. In such a case, one can look for a solution directly using the desired multiplexing rate (see parameters); the only risk is an increase in computation time.
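The two-step idea described in this section can be sketched in Python as follows. This is a simplified illustration and not the actual checkMyIndex implementation: the names `four_channel_ok` and `find_pools`, the default partial pool size of 2, the random greedy search and the rule that each index is used at most once are all assumptions of this example, and only the four-channel compatibility rule is used.

```python
import random
from itertools import combinations


def four_channel_ok(sequences):
    """Four-channel rule: each position needs a red (A/C) and a green (G/T) base."""
    return all(
        set(column) & set("AC") and set(column) & set("GT")
        for column in zip(*sequences)
    )


def find_pools(indexes, n_pools, rate, partial_size=2, max_trials=100, seed=0):
    """Search small compatible pools first, then fill them up to `rate` indexes.

    indexes: dict id -> sequence. Returns a list of pools (lists of ids) or None.
    Assumption of this sketch: each index is used at most once overall.
    """
    ids = sorted(indexes)
    if partial_size > rate or len(ids) < n_pools * rate:
        return None
    rng = random.Random(seed)
    # Step 1: only small combinations are enumerated and tested.
    small = [
        combo for combo in combinations(ids, partial_size)
        if four_channel_ok([indexes[i] for i in combo])
    ]
    for _ in range(max_trials):
        rng.shuffle(small)
        pools, used = [], set()
        for combo in small:
            if used.isdisjoint(combo):
                pools.append(list(combo))
                used.update(combo)
                if len(pools) == n_pools:
                    break
        if len(pools) < n_pools:
            continue  # this trial failed, try another random order
        # Step 2: adding indexes to a compatible pool keeps it compatible,
        # so the remaining indexes can be added without further checks.
        remaining = [i for i in ids if i not in used]
        rng.shuffle(remaining)
        for pool in pools:
            while len(pool) < rate:
                pool.append(remaining.pop())
        return pools
    return None


# Hypothetical 4-base indexes, for illustration only.
toy = {"i1": "ACGT", "i2": "GTAC", "i3": "CATG", "i4": "TGCA",
       "i5": "AAAA", "i6": "CCCC", "i7": "GGGG", "i8": "TTTT"}
pools = find_pools(toy, n_pools=2, rate=4)
```

Setting `partial_size` equal to `rate` in this sketch corresponds to searching directly at the desired multiplexing rate: more combinations are enumerated, so the search is slower but none are ruled out by the shortcut.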
Parameters
Illumina chemistry can be either four-channel (HiSeq & MiSeq), two-channel (NovaSeq, NextSeq & MiniSeq) or one-channel (iSeq 100). With the four-channel chemistry, a red laser detects A/C bases and a green laser detects G/T bases, and the indexes are compatible if there is at least one red light and one green light at each position. With the two-channel chemistry, G bases have no color, A bases are orange, C bases are red and T bases are green, and the indexes are compatible if there is at least one color at each position. Note that indexes starting with GG are not compatible with the two-channel chemistry. With the one-channel chemistry, compatibility cannot be defined with colors, and the indexes are compatible if there is at least one A, C or T base at each position. Please refer to the Illumina documentation for more detailed information on the different chemistries.
Total number of samples in your experiment (can be greater than the number of available indexes).
Multiplexing rate i.e. number of samples per pool/lane (only divisors of the total number of samples are proposed).
i7 and i5 pairing (only for dual-indexing) is proposed if there are as many i5 as i7 indexes, in order to handle Illumina Unique Dual Indexes (UDI). Note that the pairing follows the order of the indexes in the input files.
Constraint on the indexes (only for single-indexing) to avoid having two samples or two pools/lanes with the same index(es).
Directly look for a solution with the desired multiplexing rate (only for single-indexing) instead of looking for a partial solution with a few samples per pool/lane and then adding some of the remaining indexes to reach the desired multiplexing rate.
Selecting compatible indexes (only for single-indexing) before looking for a (partial) solution can take some time but then speeds up the algorithm.
Maximum number of trials can be increased if a solution is difficult to find with the parameters chosen.
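To make two of the rules above concrete, here is a minimal Python sketch of the per-chemistry compatibility test and of the list of multiplexing rates that divide the total number of samples. The function names, the chemistry labels "4", "2" and "1", and the assumption that sequences contain only A, C, G and T are choices made for this example; the GG rule is implemented exactly as worded above (any index starting with GG is rejected), which may differ in detail from what the application does. Whether the application also proposes the trivial rates (1 and the total itself) is not stated here.

```python
def is_compatible(sequences, chemistry):
    """Return True if the index sequences can share a pool/lane.

    chemistry: "4" (four-channel), "2" (two-channel) or "1" (one-channel);
    these labels are assumptions of this sketch.
    """
    if chemistry not in ("4", "2", "1"):
        raise ValueError("chemistry must be '4', '2' or '1'")
    if not sequences:
        return False
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all index sequences must have the same length")
    # Two-channel: indexes starting with GG are described as not compatible.
    if chemistry == "2" and any(s.startswith("GG") for s in sequences):
        return False
    for column in zip(*sequences):
        bases = set(column)
        if chemistry == "4":
            # At least one red (A/C) and one green (G/T) signal per position.
            ok = bool(bases & {"A", "C"}) and bool(bases & {"G", "T"})
        else:
            # Two-channel: at least one colored base; one-channel: at least
            # one A, C or T. Both mean "not only G" at this position.
            ok = bool(bases - {"G"})
        if not ok:
            return False
    return True


def proposed_rates(n_samples):
    """Multiplexing rates that divide the total number of samples."""
    return [d for d in range(1, n_samples + 1) if n_samples % d == 0]
```

For instance, `is_compatible(["AAAA", "CCCC"], "4")` is false because no position gives a green signal, while the same two sequences pass the two-channel and one-channel tests.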
About
This application was developed at the Biomics pole of the Institut Pasteur by Hugo Varet, and an Application Note describing it was published in 2018 in Bioinformatics. Feel free to send an e-mail to hugo.varet@pasteur.fr with any suggestion or bug report.
Source code and instructions to run it locally are available in the GitHub repository. Please note that checkMyIndex is provided without any guarantees as to its accuracy.
Version
This website executes checkMyIndex version 1.0.2.