Supplementary Datasets
Dataset | Title | Files | Description |
Dataset S1 | Novacore compounds | NovaCore_compounds.sdf | This file contains all of the compounds from the NovaCore SAR library that were screened at 200 uM in yeast. Compounds that were scored as active were marked with the property active set to 1. Inactive compounds have the same property set to 0. There were 49,218 compounds in total, of which 4,965 were active. |
Dataset S2 | Chemdiv compounds | | This file contains all of the compounds from the ChemDiv Diverse library that were screened at 200 uM in yeast. Compounds that were scored as active were marked with the property active set to 1. Inactive compounds have the same property set to 0. There were 27,680 compounds in total, of which 2,120 were active. |
Dataset S3 | Novacore Diverse compounds | NovaCore_Diverse_compounds.sdf | This file contains all of the compounds from the NovaCore Diverset library that were screened at 200 uM in yeast. Compounds that were scored as active were marked with the property active set to 1. Inactive compounds have the same property set to 0. There were 4,422 compounds in total, of which 391 were active. |
Dataset S4 | Spectrum Compounds | Spectrum_Compounds.sdf | This file contains all of the compounds from the Spectrum library that were screened at 50 uM in yeast. Compounds that were scored as active were marked with the property active set to 1. Inactive compounds have the same property set to 0. There were 1,998 compounds in total, of which 68 were active. |
Dataset S5 | S. pombe compounds | screened_pombe_compounds.sdf | This file contains all of the compounds that were screened in S. pombe. Compounds that were scored as active in S. pombe were marked with the property active set to 1. Inactive compounds have the same property set to 0. In total, 3,707 compounds were screened and 2,776 were active. |
Dataset S6 | B. subtilis compounds | screened_bsub_compounds.sdf | This file contains all of the compounds that were screened in B. subtilis. Compounds that were scored as active in B. subtilis were marked with the property active set to 1. Inactive compounds have the same property set to 0. In total, 4,255 compounds were screened and 1,697 were active. |
Dataset S7 | E. coli compounds | screened_ecoli_compounds.sdf | This file contains all of the compounds that were screened in E. coli. Compounds that were scored as active in E. coli were marked with the property active set to 1. Inactive compounds have the same property set to 0. In total, 4,852 compounds were screened and 247 were active. |
Dataset S8 | Mammalian compounds | screened_mammalian_compounds.sdf | This file contains all of the compounds that were screened in the human lung cancer cell line A549. Compounds are defined to be active if cell viability is less than 50% in the presence of the compound. In total, 167 compounds were screened and 116 were found to be active. |
Dataset S9 | C. elegans compounds | screened_celegans_compounds.sdf | This file contains all of the compounds that were screened in a C. elegans phenotype assay. Compounds that caused a gross phenotype were scored as active in C. elegans and were marked with the property active set to 1. Inactive compounds have the same property set to 0. In total, 5,899 compounds were screened and 809 were active. |
Dataset S10 | C. albicans compounds | screened_calbicans.sdf | This file contains all of the compounds that were screened in C. albicans. Compounds that were scored as active in C. albicans were marked with the property active set to 1. Inactive compounds have the same property set to 0. In total, 835 compounds were screened and 130 were active. |
Dataset S11 | C. neoformans compounds | screened_neoformans.sdf | This file contains all of the compounds that were screened in C. neoformans. Compounds that were scored as active in C. neoformans were marked with the property active set to 1. Inactive compounds have the same property set to 0. In total, 804 compounds were screened and 373 were active. |
Dataset S12 | NIH Molecular Library Screening Network Compounds | MLSCR_bioavailability_filter_model.ta | This is a tab-delimited file containing the entire Molecular Libraries screening collection, downloaded from PubChem. The final column is the score assigned to the compound by the naive Bayes model for growth inhibition. The higher the score, the more likely the compound is to inhibit yeast growth. A score of 1 in the second-to-last column indicates that the compound passes the bioavailability filter. |
Dataset S13 | Zinc Purchasable compounds | all_purchasable_compounds_zinc_6_4_10.tab | This is a tab-delimited file containing all purchasable compounds as defined by the Zinc database. It contains a column indicating how many of the Lipinski parameters the molecule passes, as well as a column indicating whether it passes the 2-property filter and a score from the yeast model. As the yeast model was built on drug-like compounds that pass all 4 of Lipinski's criteria, these are the compounds that the model is more likely to predict correctly. |
Dataset S14 | Raw and analyzed microarray data for the 20 compounds screened by HIP profiling | HIP_ratios.xlsx | All raw data is available as Affymetrix cel files (version 4), containing coordinates and intensity values, in the file Celfiles_Archive.zip. The information needed to translate cel file information for tag4 arrays into orf_tag information can be found in the Chip_Spot_Tag4.txt file. The following tag types are present: affy expression repaired repaired-bad tag3 tag3-bad. For HIP profiling, spots of type tag3 and repaired are used. The cel file descriptions (cel file name, compound, concentration value and units) can be found in the file Celfile_index.xlsx. Compound sensitivity for all strains, including homozygous deletion strains, to each compound is available in the file HIP_ratios.xlsx. Log2 ratios were calculated as described in the Supplemental Methods. |
Dataset S15 | Validation of screening Library compounds | Mass-spec_Dalton_Pharma.pdf | 30 reordered compounds and 30 additional compounds, chosen at random from four hitplates, were analyzed using liquid chromatography and mass spectrometry (LC-MS) to verify whether the compound of interest was present. The data and methods are available in the file Mass-spec_Dalton_Pharma.pdf and Supplementary Methods. LC-MS data for the 60 compounds from the library and the 30 reordered compounds supplied by Chembridge at time of purchase are available in the following files: Chembridge Library data.zip and Chembridge_reorder_LC_MS.pdf. |