Train Data and Test Data in Data Science

I am eager to learn data science in Python, but I am confused about two topics: train data and test data. In some programs the train and test datasets are separate, and in other programs they are combined. Please advise.



Train data set: The training data set is the subset of the data set that is used to train the model. It is usually larger than the test data set, because a model generally learns better when it has more data to train on.

Test data set: The test data set is used to test the model that you have trained on the training data set. It is also a subset of the data set, and it is usually smaller than the training data set. Its primary purpose is to measure how well the model performs, for example its accuracy, on data it did not see during training. Let's take an example that helps explain the train data set and the test data set.

ex=ex.split(data,SplitRatio = 0.65)



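In the above example, we have split the data set. The split ratio is the fraction of the data set that goes into the train data set. Here the split ratio is 0.65, which means that 65% of the data set becomes the train data set and the remaining 35% becomes the test data set. This is also why some programs use one combined data set and others use separate ones: a program that starts from a combined data set performs this split itself, while a program that loads separate train and test files is using data that was already split.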
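
Since the question is about Python, here is a minimal sketch of the same 65/35 idea using only the standard library. The helper names (split_train_test, accuracy), the toy data, and the threshold "model" are all hypothetical and only meant to illustrate the workflow: learn from the train data set, then measure accuracy on the test data set.

```python
import random


def split_train_test(rows, split_ratio=0.65, seed=42):
    """Shuffle rows and return (train, test).

    split_ratio is the fraction of rows placed in the train set.
    Hypothetical helper for illustration, not a library function.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * split_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy data (an assumption for this sketch): (feature, label) pairs,
# where the label is 1 when the feature is 50 or more.
data = [(x, int(x >= 50)) for x in range(100)]

train, test = split_train_test(data, split_ratio=0.65)
print(len(train), len(test))  # 65 35

# "Train" a very simple model using the train set only:
# the smallest feature value that carried label 1.
threshold = min(x for x, label in train if label == 1)

# Evaluate on the test set, which the model has not seen.
predictions = [int(x >= threshold) for x, _ in test]
test_accuracy = accuracy([label for _, label in test], predictions)
print("test accuracy:", test_accuracy)
```

In real projects this split is commonly done with a library instead, for example scikit-learn's train_test_split function, which accepts train_size or test_size arguments; the sketch above only shows what such a split does.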



