Building on knowledge from earlier Rfun workshops, learn basic text mining techniques with RStudio and critical packages. Attendees will analyze public domain novels by Jane Austen, wrangle text data into submission, tokenize corpora, generate word clouds, and get an introduction to sentiment analysis.
Part of the DataFest Workshop Series. R Markdown is a means of applying structure to the prose in your coding document. Using R Markdown and the xaringan package, attendees will integrate code with natural language (i.e., prose) to render presentation slides as a reproducible report. The method of integrating code with prose is known as literate coding. Attendees will focus their energies on rendering slides as one type of report while being introduced to an array of report types that can be generated from the same code.
Part of the DataFest Workshop Series. This presentation will focus on strategies for developing a short presentation that summarizes a data science project, including: identifying a compelling story in the analysis; leading with the key takeaways; and presenting results simply, effectively, and visually. Communicating with stakeholders is a core process in any data science project. Attendees will learn to construct a visually effective and time-efficient presentation for sharing data science results with stakeholders, making the most of the limited time available with them.
In this workshop, participants will learn strategies for preparing data for publishing by "curating" an example dataset and identifying common data issues. Participants will also learn about the overall role of repositories within the data sharing landscape and apply strategies for locating and assessing repositories. The workshop will include short lectures and group work via break-out rooms.
Part of the DataFest Workshop Series. Tableau Public (available for both Windows and Mac) is incredibly useful free software that allows individuals to quickly and easily explore their data with a wide variety of visual representations, as well as create interactive web-based visualization dashboards. This workshop will focus on using Tableau Public to create data visualizations, starting with an overview of how the program thinks about data, common data manipulation and loading, and the terminology used.
Building on knowledge from earlier Rfun workshops, learn how to gather and analyze tweets with the rtweet package. Attendees will use R, RStudio, and the Tidyverse to orchestrate Twitter data gathering via the Twitter API.
Part of the DataFest Workshop Series. In this workshop we will focus on ggplot2, a library for R that creates clear and well-designed visualizations and that plays well with other tidyverse packages. Attendees will get up and running quickly with ggplot2, going through a variety of examples to learn how to understand, modify, and create ggplot2 visualizations. Building basic skills with visualization will improve your ability to create quick, exploratory visualizations for data analysis as well as more formal, outward-facing visualizations for presentations or publications.
Many federal and private funders require data management plans as part of a grant application, including NIH, which recently released a new Data Management and Sharing Policy that takes effect in 2023 and will apply to all grants. This workshop will cover the components of a data management plan, what makes a strong plan and how to adhere to it, and where to find guidance, tools, resources, and assistance for building funder-based plans.
Part of the DataFest Workshop Series. R, together with the Tidyverse, is a data-first coding language that enables reproducible workflows. Attendees will learn the fundamentals of R and the Tidyverse, learn how to wrangle data for analysis, and practice reporting exploratory data analysis (EDA) with R Markdown.
Building on knowledge from earlier Rfun workshops, useRs will be introduced to web crawling and HTML parsing. In this introductory web scraping workshop, attendees will use the rvest package to deconstruct a target site into structured data by combining limited knowledge of HTML specifications with a very limited appreciation of HTTP, along with basic Tidyverse-style iteration.