Creating an R data package and a fun dataset about my garden
I started vegetable gardening in 2004 and have been hooked ever since. During graduate school, while learning about split-plots and nested designs, I dreamed about experimenting in my own garden but never quite found the motivation.
This past summer I finally decided to at least collect some data from my garden. I did this for two reasons:
1. I was curious about how much food I produced.
2. I wanted to use the data in my Introductory Data Science course at Macalester College.
I knew the data would be fairly simple and I liked that it would be a bit personal and give a way for me to connect with students. This felt especially important this year while I have been teaching remotely.
I could spend a lot of time talking about my garden, but I’ll keep it brief. I plan to eventually put a post in my “Non-R Stuff” area for those who are into gardening as much as they are into R. But, for now, I’ll just show some photos of early season vs. late season garden. Hopefully this helps illustrate why I fondly refer to it as the jungle garden. Truthfully, the garden at my old house was more of a jungle since it sprawled across my backyard more, but I think this one is still worthy of the name.
I am personally most proud of the brick paths - I laid each brick myself, apart from occasional help from my kids and the neighbor kids, whom I paid $0.10/brick (thankfully they got bored quickly!). I also have to give my husband Chris credit for the fence and raised bed boxes, although I did dig all the fence post holes (with a manual fence post digger!).
Before I could make a data package, I had to collect the data. This was both an awful and great experience. I think anyone who analyzes data should have a go at collecting their own data at least once. Even in this very small endeavor of mine, I learned a lot.
I took some time thinking about the data I might want to analyze and tried my best to set up spreadsheets to collect everything I thought I would need. I made four Google sheets:
I liked putting the data in Google sheets because I could use Jenny Bryan’s {googlesheets4} package to interact with the data as I was collecting it.
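For anyone who has not used {googlesheets4}, here is a minimal sketch of what reading a sheet can look like. The helper name `read_garden_sheet()` and the URL in the comment are made up for illustration, and `gs4_deauth()` assumes the sheet is shared so that anyone with the link can view it.

```r
# Hypothetical helper: read one Google Sheet into a tibble.
read_garden_sheet <- function(sheet_url) {
  # Skip the Google login prompt; this only works for sheets that are
  # shared as "anyone with the link can view".
  googlesheets4::gs4_deauth()

  # read_sheet() downloads the first worksheet and guesses column types.
  googlesheets4::read_sheet(sheet_url)
}

# Usage (placeholder URL, replace with your own sheet):
# harvest <- read_garden_sheet("https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID")
# head(harvest)
```

Because the sheet is read fresh each time, re-running the call picks up whatever rows have been added since the last read.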
At the beginning of the summer, I really enjoyed weighing all the harvests. It was exciting to see how much food we grew on our own! But, by mid-summer, I was already starting to get annoyed by having to collect the data. Do you notice there’s very little raspberry data? That’s because I ate it before I weighed it - usually while weeding the garden. I also noticed (just anecdotally) that I felt more pressure on myself to use all the food I harvested and not let any go to waste. So, I think my family and I did a good job eating even more of the veggies than usual and neighbors probably got annoyed with me asking them to please take some zucchinis off my hands.
After using the data I collected with two sections of my Introductory Data Science students, I decided it was time to package it up and share it with others. If you want to see examples of how I used the data, check out the tutorials on my course website.
I started by coming up with the name for the data package and making a hex sticker. Is this where you should start? Probably not, but maybe? It gave me the motivation I needed to see it through. I used the {hexSticker} package by Guangchuang Yu to create mine. Check out more about that here. I also ordered way more stickers than I’ll probably ever need via stickermule - if you want some, get in touch with me :)
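As a rough idea of what a {hexSticker} call looks like, here is a small sketch. The plot, colors, sizes, and file name are all placeholders chosen for illustration, not the settings behind the actual gardenR sticker.

```r
library(ggplot2)
library(hexSticker)

# A stand-in plot for the sticker artwork (made-up numbers, not garden data).
p <- ggplot(data.frame(x = 1:5, y = c(1, 3, 2, 5, 4)), aes(x, y)) +
  geom_line(color = "white") +
  theme_void()

# sticker() draws the hexagon, places the plot inside it, and writes a PNG.
sticker(
  subplot = p,
  package = "gardenR",            # text shown on the sticker
  p_size = 20,                    # font size of the package name
  s_x = 1, s_y = 0.8,             # position of the plot inside the hexagon
  s_width = 1.2, s_height = 0.9,  # size of the plot inside the hexagon
  h_fill = "#2E7D32",             # hexagon fill (an arbitrary green)
  h_color = "#1B5E20",            # hexagon border color
  filename = "gardenR_hex.png"    # output file (placeholder name)
)
```

The `subplot` argument can also be a path to an image file, which is handy if the artwork is a photo or drawing rather than a plot.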
The three resources I ended up using are:
rstudio4edu Chapter 12: Create a data package by Desirée De Leon and Alison Hill. I’m pretty sure that everything I learn about R lately is from these two. Although they are both experienced R users, they are also amazingly good at writing tutorials and how-to’s from the perspective of someone who has never done the task before. I loved the “Is this tutorial for you?” checklist at the top of the section. Right away I knew - yes! That is me! This is for me!
Writing an R Package from Scratch by Tomas Westlake, which is an update of Hilary Parker’s blog post on the same topic.
R Packages book - really just section 18.9, plus a second set of screenshots of documentation that I used to check I had done it correctly. There is probably a lot more I should read about in here, which I hope to do someday.
Because there are already so many great resources out there, I am going to highlight how I used them, rather than make a new resource. I ended up using a combination of resources because I wasn’t able to follow any of the instructions exactly, without running into differences or errors. There is a good chance this is due to some mistakes I made along the way.
I tried to use the use_github() function from {usethis}, as recommended by both the rstudio4edu and Westlake resources, but I couldn’t get it to work. So, instead, I followed the instructions in section 18.9 of R Packages:
1. Create a new GitHub repo with the same name as your package, without a README.
2. After doing this, an instruction page opens that tells you to copy and paste some code into a shell - do that. The code I had to copy was this - yours will be similar but specific to your package and GitHub username:
git remote add origin https://github.com/llendway/gardenR.git
git branch -M main
git push -u origin main
After I had it synched with GitHub, I used Parts 2-6 (Sections 12.4-12.8) of the rstudio4edu resource. I followed their steps exactly - they even remind you to commit!
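For a sense of what those parts involve, here is a rough sketch of the usual {usethis} workflow for a data package. It is not a copy of the rstudio4edu steps. To keep it safe to run, it builds a throwaway package called `demopkg` in a temporary folder, and the dataset name `garden_harvest` and its two rows are made up for illustration.

```r
library(usethis)

# Create a throwaway package in a temporary folder so nothing real is touched.
pkg_dir <- file.path(tempdir(), "demopkg")
create_package(pkg_dir, open = FALSE)
proj_set(pkg_dir)  # make it the active project for the calls below

# 1. Create data-raw/garden_harvest.R, the script that holds the cleaning code.
use_data_raw("garden_harvest", open = FALSE)

# 2. Build the data frame (normally done inside that script), then save it
#    to data/garden_harvest.rda so the package can export it.
garden_harvest <- data.frame(
  vegetable = c("lettuce", "zucchini"),  # made-up rows
  weight = c(20, 175)
)
use_data(garden_harvest, overwrite = TRUE)

# 3. Create R/data.R, where the roxygen comments documenting the data go.
use_r("data", open = FALSE)

# 4. After writing the documentation, devtools::document() turns it into
#    help files (not run here, since R/data.R is still empty).
```

In a real package you would skip the temporary folder and run steps 1 to 4 from inside the package project, committing as you go.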
Add a README. I wanted a nice page to welcome people. I followed Steps 8 & 9 of Writing an R Package from Scratch to help me create a nice README file that includes the hex sticker logo :)
That’s what I plan to do very soon on Twitter. If you are reading this, it probably means I did it. Check out the {gardenR} package repo here.
For attribution, please cite this work as
Lendway (2020, Dec. 31). Lisa Lendway: Welcome to the Jungle ... Garden. Retrieved from https://lisalendway.netlify.app/posts/2020-12-30-welcometothejungle/
BibTeX citation
@misc{lendway2020welcome,
  author = {Lendway, Lisa},
  title = {Lisa Lendway: Welcome to the Jungle ... Garden},
  url = {https://lisalendway.netlify.app/posts/2020-12-30-welcometothejungle/},
  year = {2020}
}