Reliable Intelligence Identification on Vietnamese SNSs (ReINTEL) Forum


> Wrong dtype in train data

In the public_train.csv file, the sample with id=82 looks wrong: it has text in the num_comment_post column and a relatively high value in the num_share_post column.
The same problem appears at id=432 in the warm-up train set and id=5835 in the public test set.

Posted by: phanviethoang1512 @ Nov. 9, 2020, 4:34 a.m.

Hi,

Thank you for the feedback.

Some data points have minor column-shift errors. The long sequence of digits belongs in the "timestamp" column, and the strings in the "likes" and "comments" columns are parts of the post content.
Therefore, you can (1) combine the parts of the post content, (2) move the timestamp to its correct column, and (3) treat the values in the "likes", "comments" and "share" columns as missing data.
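
The three steps above could be sketched in pandas roughly as follows. This is only an illustration under several assumptions, not an official fix: the column names post_message, timestamp_post and num_like_post are assumed (only num_comment_post and num_share_post are named in this thread), the toy rows are made up, the text fragments are assumed to spill rightwards starting from the timestamp column, and the fragments are joined with a single space.

```python
import math
import pandas as pd

# Assumed column names; adjust them to the actual CSV header.
TEXT_COL = "post_message"
TIME_COL = "timestamp_post"
COUNT_COLS = ["num_like_post", "num_comment_post", "num_share_post"]


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_number(value):
    """True if the value parses as a finite number."""
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        return False


def repair_row(row):
    """Repair one record (a dict) if text has spilled into the count columns."""
    row = dict(row)
    shifted = any(
        not is_missing(row[c]) and not is_number(row[c]) for c in COUNT_COLS
    )
    if not shifted:
        return row

    fragments = [str(row[TEXT_COL])]
    timestamp = None
    # Walk left to right: text fragments first, then the first numeric
    # value is taken to be the displaced timestamp (an assumption).
    for col in [TIME_COL] + COUNT_COLS:
        value = row[col]
        if is_missing(value):
            continue
        if is_number(value):
            timestamp = value
            break
        fragments.append(str(value))

    row[TEXT_COL] = " ".join(fragments)   # (1) combine the post content
    row[TIME_COL] = timestamp             # (2) restore the timestamp
    for col in COUNT_COLS:                # (3) counts become missing data
        row[col] = None
    return row


# Made-up toy data: one normal row and one shifted row.
df = pd.DataFrame([
    {"id": 1, "post_message": "normal post", "timestamp_post": "1585945439",
     "num_like_post": "12", "num_comment_post": "3", "num_share_post": "1"},
    {"id": 2, "post_message": "first part", "timestamp_post": "second part",
     "num_like_post": "third part", "num_comment_post": "fourth part",
     "num_share_post": "1585945439"},
])

fixed = pd.DataFrame(
    [repair_row(r) for r in df.to_dict("records")], columns=df.columns
)
print(fixed)
```

On the real files, reading with pd.read_csv("public_train.csv", dtype=str) keeps every field as a string, so the mixed text and numeric values can be inspected before any numeric conversion.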

We apologize for the inconvenience. Thank you for pointing out the error.

Regards,
ReINTEL Team

Posted by: reintel-organizers @ Nov. 9, 2020, 4:59 a.m.