Fig. 4
From: Genomic benchmarks: a collection of datasets for genomic sequence classification

Python code for downloading and accessing the dataset as raw text files. First, we download the dataset to our local machine; then we sequentially read all files and store the samples in a dictionary. A full example can be found at https://github.com/ML-Bioinfo-CEITEC/genomic_benchmarks/blob/main/notebooks/How_To_Train_BERT_Classifier_With_HF.ipynb
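
Since the figure image is not reproduced here, the reading step described in the caption can be sketched as follows. This is a minimal illustration rather than the authors' exact code: the helper name `read_samples` is hypothetical, and the directory layout `<dataset_dir>/<split>/<label>/<sample>.txt` is an assumption about how the downloaded files are organised. The commented-out download call assumes the `genomic_benchmarks` package exposes `download_dataset` in its `loc2seq` module and that it returns the local dataset path; the dataset name shown there is only an example.

```python
from pathlib import Path


def read_samples(dataset_dir):
    """Read every file under <dataset_dir>/<split>/<label>/ into a nested dict.

    Returns {split_name: {label_name: [sequence, ...]}}.
    The split/label directory layout is an assumption (see lead-in).
    """
    samples = {}
    for split_dir in sorted(Path(dataset_dir).iterdir()):
        if not split_dir.is_dir():
            continue  # skip stray files such as metadata
        samples[split_dir.name] = {}
        for label_dir in sorted(split_dir.iterdir()):
            if not label_dir.is_dir():
                continue
            # Read the files one by one, in a stable (sorted) order.
            samples[split_dir.name][label_dir.name] = [
                f.read_text().strip()
                for f in sorted(label_dir.iterdir())
                if f.is_file()
            ]
    return samples


# Hypothetical usage (needs the package installed and network access):
# from genomic_benchmarks.loc2seq import download_dataset
# dataset_dir = download_dataset("human_nontata_promoters")  # assumed to return the local path
# samples = read_samples(dataset_dir)
# print({split: list(labels) for split, labels in samples.items()})
```

Keeping the reader separate from the download step means the same function works on any local copy of a dataset that follows this layout.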