
Jester Collaborative Filtering Dataset

aakaashjois 3 months ago 1.0.0 FREE

Files

- User Ratings.csv (55MB)
- Joke Text.csv (32KB)

License

CC BY-NC-SA 4.0
### Context

The funniness of a joke is very subjective. With more than 70,000 users rating jokes, can an algorithm be written to identify the universally funny joke?

### Content

- The data files are in **.csv** format.
- The complete dataset is 100 rows and 73422 columns.
- The complete dataset is split into 3 **.csv** files.
- **JokeText.csv** contains the Id of the joke and the complete joke string.
- **UserRatings1.csv** contains the ratings provided by the first 36710 users.
- **UserRatings2.csv** contains the ratings provided by the last 36711 users.
- The dataset is arranged such that the initial users have rated a higher number of jokes than the later users.
- The rating is a real value between **-10.0** and **+10.0**.
- The **empty values** indicate that the user has not provided any rating for that particular joke.

### Acknowledgements

The dataset is associated with the research paper below.

[Eigentaste: A Constant Time Collaborative Filtering Algorithm.](http://www.ieor.berkeley.edu/~goldberg/pubs/eigentaste.pdf) Ken Goldberg, Theresa Roeder, Dhruv Gupta, and Chris Perkins. Information Retrieval, 4(2), 133-151. July 2001.

More information and datasets can be found at [http://eigentaste.berkeley.edu/dataset/](http://eigentaste.berkeley.edu/dataset/)

### Inspiration

Since funniness is a very subjective matter, it will be interesting to see whether data science can bring out the details of what makes something funny.
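As a rough illustration of the layout described under Content (one row per joke, one column per user, empty cells for missing ratings), the sketch below parses a tiny inline sample and averages each joke's ratings while skipping the blanks. The sample values and the column names (`JokeId`, `User1`, ...) are assumptions made for the example; the real files may label their columns differently.

```python
import csv
import io

# Hypothetical sample in the wide layout: one row per joke, one column per
# user, empty cells meaning "not rated". Column names are assumptions.
SAMPLE = """JokeId,User1,User2,User3
0,5.1,,-8.79
1,,4.9,
2,-3.5,-9.85,7.25
"""


def parse_rating(cell):
    """Return the rating as a float, or None when the cell is empty."""
    cell = cell.strip()
    if not cell:
        return None
    value = float(cell)
    # Ratings are documented as real values between -10.0 and +10.0.
    if not -10.0 <= value <= 10.0:
        raise ValueError(f"rating out of range: {value}")
    return value


def mean_rating_per_joke(text):
    """Map each joke id to the mean of its ratings (None if nobody rated it)."""
    means = {}
    for row in csv.DictReader(io.StringIO(text)):
        joke_id = int(row.pop("JokeId"))
        ratings = [r for r in map(parse_rating, row.values()) if r is not None]
        means[joke_id] = sum(ratings) / len(ratings) if ratings else None
    return means


means = mean_rating_per_joke(SAMPLE)
```

A plain mean is only a starting point: because early users rated more jokes than later ones, jokes and users are not evenly covered, so any ranking built this way should be read with that imbalance in mind.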

Datasets

Joke Text: jester-collaborative-filtering-dataset/joke-text

Columns: JokeId, JokeText

User Ratings: jester-collaborative-filtering-dataset/user-ratings

Columns: JokeId, UserId, Rating
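The User Ratings columns listed here suggest a long layout (one row per rating), while the description mentions a wide table of 100 rows with one column per user. Assuming both describe the same data, the following sketch shows one way to reshape wide rows into `(JokeId, UserId, Rating)` records. The helper name `wide_to_long` and the sample header are hypothetical, and the first column is assumed to hold the joke id.

```python
import csv
import io


def wide_to_long(wide_text):
    """Yield (joke_id, user_id, rating) tuples, skipping empty cells."""
    reader = csv.reader(io.StringIO(wide_text))
    header = next(reader)
    user_ids = header[1:]  # assumption: first column is the joke id
    for row in reader:
        joke_id = row[0]
        for user_id, cell in zip(user_ids, row[1:]):
            if cell.strip():  # an empty cell means the user did not rate the joke
                yield joke_id, user_id, float(cell)


# Hypothetical two-joke, two-user sample.
WIDE = "JokeId,User1,User2\n0,5.1,\n1,,-2.5\n"
long_rows = list(wide_to_long(WIDE))
```

Dropping the empty cells keeps the long form compact, which matters when most users have rated only a fraction of the jokes.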
