Table 1 Summary and comparison of DTI prediction methods for identifying interactions, relative to our presented framework

From: An ensemble-based drug–target interaction prediction approach using multiple feature information with data balancing

Paper

Drug feature and protein feature

Method for negative samples

Description

Method

DTI-SNNFRA [30]

(2021)

Drug: constitutional, topological, and geometrical descriptors.

Protein: amino acid, pseudoamino acid, and CTD

First, the similarity between the drugs and the proteins is computed. Then, shared nearest neighbors and k-medoids clustering are applied.

First, compute the similarity between the drugs and the proteins. Then, apply shared nearest neighbors and k-medoids clustering, and use the RUSBoost classifier for the prediction stage.

1. Shared nearest neighbors

2. RUSBoost Classifier

DeepCon [31]

(2019)

Drug: Morgan fingerprint

Protein: CNN on raw protein sequence, CTD

Based on the similarity between the drugs and the proteins; the distance between each drug and protein is then computed.

First, compute the distance based on the similarity of drug and target features to predict the negative samples and achieve class balance. Second, apply a DBN for the prediction stage.

1. The similarity of drug and target features

2. Deep belief network (DBN)

Idti-MLKdr [32]

(2021)

Drug: Morgan fingerprint

Protein: AAC, DC, TC

Evaluate the molecular similarity of drug and target features based on the Tanimoto coefficient (TC). Then, the Cluster-Based Molecular Similarity algorithm calculates and selects the top-ranked drugs and targets.

The Tanimoto coefficient (TC) measures the similarity between the drugs and between the proteins. Then, apply the cluster algorithm, and finally use multikernel learning (MKL).

1. Cluster algorithm

2. Multikernel learning (MKL)

PreDTIs [33]

(2021)

Drug: drug-molecular substructure pattern fingerprint

Protein: PsePSSM

Use the SVM classifier. Then, the Euclidean distance between the predicted and the real feature values is calculated.

Use the SVM classifier. Then, calculate the Euclidean distance between the real and predicted values, and use LightGBM for prediction.

1. Euclidean distance

2. LightGBM Classifier

[20] (2020)

Drug: molecular substructure fingerprints

Protein: apply the PSSM, then apply the LOOP method to extract protein features

Randomly select the number of negative samples, which is the same as the number of positive samples.

Randomly select negative samples, equal in number to the positive samples.

Apply the rotation forest for prediction.

1. Rotation forest

[35] (2020)

Drug: Morgan fingerprint.

Protein: 20 amino acids

The negative sample sets consist of the same number of randomly selected pairs of unrelated drugs and proteins.

Randomly select the negative samples.

Apply Random Forest for prediction.

1. Random Forest classifier

[16] (2017)

Drug: molecular descriptors and molecular fingerprints (MFs).

Protein: AAC, DC, and TC

The negative dataset can be randomly selected from the DTS.

Randomly select the negative samples.

Apply the deep belief network for prediction.

1. Deep belief network (DBN)

[34] (2020)

Drug: (E-state) fingerprints

Protein: (APAAC)

The Euclidean distance from all unlabeled samples to the positive center is calculated and sorted. The farther the distance is, the more likely the sample is to be negative.

Compute the Euclidean distance from all unlabeled samples to the positive center.

Apply support vector machines (SVM) for prediction.

1. Euclidean distance

2. Support vector machines (SVM)
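
The table's negative-sample strategies fall into two broad families: random selection of as many negatives as positives (as in [20], [35], and [16]), and selection by distance from the positive center (as in [34]). The following is a minimal sketch of both, not the implementation used by any of the cited papers. The function names, the toy feature vectors, and the choice to use the component-wise mean as the "positive center" are all assumptions made for illustration.

```python
import math
import random


def random_negatives(unlabeled, n_positive, seed=0):
    """Randomly pick as many unlabeled pairs as there are positives."""
    rng = random.Random(seed)
    return rng.sample(unlabeled, n_positive)


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors (assumed 'positive center')."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def farthest_negatives(positives, unlabeled, n_negative):
    """Rank unlabeled vectors by Euclidean distance to the positive center
    and keep the farthest ones as likely negatives."""
    center = centroid(positives)
    ranked = sorted(unlabeled, key=lambda v: math.dist(v, center), reverse=True)
    return ranked[:n_negative]


# Toy drug-target pair feature vectors (hypothetical values).
positives = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
unlabeled = [[1.0, 1.5], [5.0, 5.0], [-4.0, 1.0], [1.2, 0.8], [9.0, 1.0]]

balanced_random = random_negatives(unlabeled, len(positives))
balanced_far = farthest_negatives(positives, unlabeled, 2)
```

On this toy data the positive center is [1.0, 1.0], so the two samples kept by the distance-based strategy are [9.0, 1.0] and [5.0, 5.0]. Random selection ignores the features entirely, which is simpler but may label undiscovered interactions as negatives.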