When random_state is set to an integer, train_test_split returns the same split on every execution.
When random_state is set to None, train_test_split draws from NumPy's global random state, so it generally returns a different split on each execution.
See the example below:
from sklearn.model_selection import train_test_split

X_data = range(10)
y_data = range(10)

# Fixed seed: every iteration produces the same split.
for i in range(5):
    X_train, X_test, y_train, y_test = train_test_split(
        X_data, y_data, test_size=0.3, random_state=0)  # zero or any other integer
    print(y_test)

print("*" * 30)

# No seed: each iteration produces a new split.
for i in range(5):
    X_train, X_test, y_train, y_test = train_test_split(
        X_data, y_data, test_size=0.3, random_state=None)
    print(y_test)
Output (the last five lines will vary from run to run):
[2, 8, 4]
[2, 8, 4]
[2, 8, 4]
[2, 8, 4]
[2, 8, 4]
******************************
[4, 7, 6]
[4, 3, 7]
[8, 1, 4]
[9, 5, 8]
[6, 4, 5]
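
To check the reproducibility claim programmatically rather than by eye, here is a minimal sketch. The helper name split_test_labels and the seed value 42 are made up for illustration; any integer seed should behave the same way.

```python
from sklearn.model_selection import train_test_split


def split_test_labels(seed):
    """Return the test labels from one 70/30 split of the integers 0..9.

    `split_test_labels` is a hypothetical helper, not part of scikit-learn.
    """
    data = list(range(10))
    _, _, _, y_test = train_test_split(
        data, data, test_size=0.3, random_state=seed)
    return list(y_test)


# Two calls with the same integer seed yield the same test set.
first = split_test_labels(42)
second = split_test_labels(42)
print(first == second)
```

With random_state=None the same comparison will usually print False, but it can occasionally print True by chance, so it is not a reliable assertion.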