Title: | Complete Works of William Shakespeare in Tidy Format |
---|---|
Description: | Provides R data structures for Shakespeare's complete works, as provided by Project Gutenberg <https://www.gutenberg.org/ebooks/100>. |
Authors: | Zane Billings [aut, cre] |
Maintainer: | Zane Billings <[email protected]> |
License: | GPL-3 |
Version: | 0.0.9 |
Built: | 2024-11-10 03:13:20 UTC |
Source: | https://github.com/wzbillings/bardr |
A data frame containing the full text of the complete works of William Shakespeare, as provided by Project Gutenberg.
all_works_df
A data frame with 166340 rows and 4 variables:
short (or common) name of the work
the full contents of the work. Each line is ~70 characters
the complete name of the work, as listed
whether the work is poetry, history, comedy, or tragedy
http://www.gutenberg.org/files/100/100-0.txt
works <- bardr::all_works_df
subset(works, works$genre == "History")
A list containing the full text of the complete works of William Shakespeare, as provided by Project Gutenberg.
all_works_list
A list with 44 elements. Each element is a character vector holding the full text of one work, and the element's name is the name of that work.
http://www.gutenberg.org/files/100/100-0.txt
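The list structure described above can be explored with base R alone. The following is a minimal sketch, not taken from the package: `toy_works` is a hypothetical stand-in with made-up names and text, used so the example runs without bardr installed. The same calls should apply to `bardr::all_works_list`, assuming it has the shape described.

```r
# Hypothetical stand-in for bardr::all_works_list: a named list whose
# elements are character vectors, one string per line of text.
toy_works <- list(
  "Play A" = c("First line of play A.", "Second line of play A."),
  "Poem B" = c("Only line of poem B.")
)

# The names of the works are the names of the list elements.
work_names <- names(toy_works)

# Extract one work by name; the result is a character vector of lines.
play_a <- toy_works[["Play A"]]

# Count the lines in each work (a named integer vector).
line_counts <- lengths(toy_works)

print(work_names)
print(head(play_a, 1))
print(line_counts)
```

With the real data, replacing `toy_works` by `bardr::all_works_list` and "Play A" by one of the values returned by `names()` should give the lines of that work.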
The bardr package provides R data structures for all of William Shakespeare's works available in the Project Gutenberg ebook. The data have already been wrangled and cleaned, so they can be used in R directly.
Inspired by the janeaustenr package by Julia Silge: see https://github.com/juliasilge/janeaustenr .
The complete works are available in two formats.
One is a named list, where each entry is a character vector. The name of the entry is the name of the work, and the contents of the vector are the lines of the associated text file (all lines are <= 70 characters).
The other is a data frame with a column for the name of the work (repeated once for each line of content) and a column for the content of the work, where each cell in the content column is one line of text. Two further columns give the full name and the genre of the work.
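To make the relationship between the two formats concrete, here is a rough sketch of how a named list of lines maps to a long data frame and back. It is an illustration under stated assumptions, not the package's own build code: `toy_list` is hypothetical data, and the column names `name` and `content` are assumed for the example.

```r
# Hypothetical named list in the same shape as the list format:
# element name = work name, element value = lines of text.
toy_list <- list(
  "Play A" = c("First line of play A.", "Second line of play A."),
  "Poem B" = c("Only line of poem B.")
)

# List -> data frame: repeat each work's name once per line,
# then flatten all lines into a single column.
# The column names `name` and `content` are assumptions for this sketch.
toy_df <- data.frame(
  name = rep(names(toy_list), times = lengths(toy_list)),
  content = unlist(toy_list, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Data frame -> list: split the content column by work name.
# Note that split() orders the result by sorted work name.
back_to_list <- split(toy_df$content, toy_df$name)

print(toy_df)
print(names(back_to_list))
```

The long format suits row-wise filtering and grouping, while the list format is convenient when one work at a time is needed as a plain character vector.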