Cornell NLVR

The data is split into training, development, and two test sets. The first test set is public and released with the data; the second will not be released. The leaderboard reports accuracy on the development and public test sets, and both accuracy and consistency on the unreleased test set (Test-U). Rankings in the leaderboards below are based on accuracy on the unreleased test set.
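
The two leaderboard metrics can be illustrated with a minimal sketch. It assumes that consistency means the fraction of unique sentences for which every associated example is predicted correctly; the text above does not define the metric, so treat that definition as an assumption. The `(sentence_id, gold, pred)` tuple layout and the function names are hypothetical, not part of the official evaluation code.

```python
from collections import defaultdict


def accuracy(examples):
    """Fraction of examples whose predicted label equals the gold label."""
    if not examples:
        return 0.0
    correct = sum(1 for _, gold, pred in examples if gold == pred)
    return correct / len(examples)


def consistency(examples):
    """Fraction of unique sentences with all of their examples correct.

    Assumed definition; check the official evaluation script before relying on it.
    """
    by_sentence = defaultdict(list)
    for sentence_id, gold, pred in examples:
        by_sentence[sentence_id].append(gold == pred)
    if not by_sentence:
        return 0.0
    fully_correct = sum(1 for results in by_sentence.values() if all(results))
    return fully_correct / len(by_sentence)


# Hypothetical toy data: (sentence_id, gold_label, predicted_label)
examples = [
    ("s1", True, True),
    ("s1", False, False),
    ("s2", True, False),
    ("s2", True, True),
]

print(accuracy(examples))
print(consistency(examples))
```

In this toy data three of the four predictions are correct, but only one of the two sentences has all of its examples right, so consistency comes out lower than accuracy.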