Part of the DataFest Workshop Series. This presentation will focus on strategies for developing a short presentation that summarizes a data science project, including: identifying a compelling story in the analysis; leading with the key takeaways; and presenting results simply, effectively, and visually. Communicating with stakeholders is a core process in any data science project. Attendees will learn to construct a visually effective and time-efficient presentation for sharing data science results with their stakeholders, making the most of the time they have with them.
In this workshop, participants will learn strategies for preparing data for publishing by "curating" an example dataset and identifying common data issues. Participants will also learn about the overall role of repositories within the data sharing landscape and apply strategies for locating and assessing repositories. The workshop will include short lectures and group work via break-out rooms.
Part of the DataFest Workshop Series. Tableau Public (available for both Windows and Mac) is incredibly useful free software that allows individuals to quickly and easily explore their data with a wide variety of visual representations, as well as create interactive web-based visualization dashboards. This workshop will focus on using Tableau Public to create data visualizations, starting with an overview of how the program thinks about data, common data manipulation and loading, and the terminology used.
Building on knowledge from earlier Rfun workshops, learn how to gather and analyze tweets with the rvest package. Attendees will use R, RStudio, and the Tidyverse to orchestrate Twitter data gathering via the Twitter API.
Part of the DataFest Workshop Series. In this workshop we will focus on ggplot2, a library for R that creates clear and well-designed visualizations and that plays well with other tidyverse packages. Attendees will get up and running quickly with ggplot2, going through a variety of examples to learn how to understand, modify, and create ggplot2 visualizations. Building basic skills with visualization will improve your ability to create quick, exploratory visualizations for data analysis as well as more formal, outward-facing visualizations for presentations or publications.
Many federal and private funders require data management plans as part of a grant application, including the NIH, which recently released a new Data Management and Sharing Policy that takes effect in 2023 and will apply to all grants. This workshop will cover the components of a data management plan, what makes a strong plan and how to adhere to it, and where to find guidance, tools, resources, and assistance for building funder-based plans.
Part of the DataFest Workshop Series. R, together with the Tidyverse, is a data-first coding language that enables reproducible workflows. Attendees will learn the fundamentals of R and the Tidyverse, how to wrangle data for analysis, and how to report Exploratory Data Analysis (EDA) with R Markdown.
Building on knowledge from earlier Rfun workshops, useRs will be introduced to web crawling and HTML parsing. In this introductory web scraping workshop, attendees will use the rvest package to deconstruct a target site into structured data by combining limited knowledge of HTML specifications with a very limited appreciation of the HTTP protocol along with basic Tidyverse-style iteration.
This workshop will explore the many different ethical issues that can arise with data management and sharing and strategies to address those issues to ensure that goals set by publishers and funders around reproducibility and reuse can be met. How are researchers expected to comply with data sharing policies and practices when they do not actually own the data or ensure disclosure protection for human participants? Likewise, how can researchers ethically collect, handle, and share data from certain communities, such as Indigenous Peoples?
Building on earlier Rfun workshops, exploit your knowledge of familiar Tidyverse syntax to query remote databases via RStudio. Attendees will be introduced to the dbplyr package as an alternative to SQL database querying. Following a review of dplyr and an overview of Google BigQuery public datasets, attendees will practice querying Google BigQuery public data.