When I first started training models after the fast.ai course, I wasn't sure how much testing was actually necessary. I'm still not entirely sure, but I essentially copied the method used in this University of Texas paper, where they hand-labeled 500 tweets that had been held out from the training and validation sets.
I did the same and was very pleased to find that while my model was 99% accurate on the validation set, it was still 93.6% accurate on my hand-labeled test set.
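As a rough sketch of that check, assuming a fastai v1 classifier exported with learn.export() and a CSV of the 500 hand-labeled tweets (the file name, column names, and export path below are placeholders, not my actual values):

```python
import pandas as pd
from fastai.text import load_learner

# The 500 hand-labeled tweets, held out from training and validation
# (placeholder file and column names)
test_df = pd.read_csv('hand_labeled_tweets.csv')

# Load the exported ULMFiT classifier (placeholder path and filename)
learn = load_learner('models', 'ulmfit_twitter_export.pkl')

# predict returns (category, class index, probabilities); keep the category label
preds = [str(learn.predict(text)[0]) for text in test_df['text']]

accuracy = (pd.Series(preds) == test_df['label'].astype(str)).mean()
print(f'Hand-labeled test accuracy: {accuracy:.1%}')
```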
I know I can improve my test set as well, but I also know it's important to finish the project at the level of accuracy that is actually useful.
Below is the code used for inference with the trained ULMFiT Twitter model:
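This is a minimal sketch of that step, using the same placeholder export path as above and assuming fastai v1's load_learner/predict API; the tweet text is just an example:

```python
from fastai.text import load_learner

# Load the exported ULMFiT Twitter classifier (placeholder path and filename)
learn = load_learner('models', 'ulmfit_twitter_export.pkl')

# predict returns (predicted category, class index, per-class probabilities)
category, class_idx, probs = learn.predict("placeholder tweet text")
print(category, probs)
```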