Part of the DataFest Workshop Series. R, together with the Tidyverse, is a data-first coding language that enables reproducible workflows. Attendees will learn the fundamentals of R and the Tidyverse, learn how to wrangle data for analysis, and practice reporting Exploratory Data Analysis (EDA) with R Markdown.
Building on knowledge from earlier Rfun workshops, useRs will be introduced to web crawling and HTML parsing. In this introductory web scraping workshop, attendees will use the rvest package to deconstruct a target site into structured data by combining limited knowledge of HTML specifications with a very limited appreciation of the HTTP protocol, along with basic Tidyverse-style iteration.
This workshop will explore the many ethical issues that can arise with data management and sharing, along with strategies to address those issues so that goals set by publishers and funders around reproducibility and reuse can be met. How are researchers expected to comply with data sharing policies and practices when they do not actually own the data or ensure disclosure protection for human participants? Likewise, how can researchers ethically collect, handle, and share data from certain communities, such as Indigenous Peoples?
Building on earlier Rfun workshops, exploit your knowledge of familiar Tidyverse syntax to query remote databases via RStudio. Attendees will be introduced to the dbplyr package as an alternative to querying databases with SQL. Following a review of dplyr and an overview of Google BigQuery public datasets, attendees will practice querying Google BigQuery public data.
Tableau is a software package that is increasingly popular for creating striking visualizations, such as charts and graphs, from tabular data. It also has an increasing number of capabilities to create maps. Source data can include native geospatial files (such as shapefiles or GeoJSON files), but also tabular data (such as CSV or Excel files) that include locational values, such as place names or coordinate data. This workshop will cover how to create maps in Tableau, along with ways to manipulate the data and symbolize it effectively on a map.
Poster sessions are an incredible opportunity to share our work with a broader audience, get feedback, and network with our peers, as well as potential employers, funders and collaborators. Our careers often depend on performing well in these exciting and often chaotic venues, but few of us are trained in graphic design and visual storytelling! In this talk, I will present some principles for creating an effective academic poster, and introduce you to a group critique process that should help you tell your story more clearly and stand out from the crowd.
The importance of reproducibility, replication, and transparency in the research endeavor is increasingly discussed in academia. This workshop will introduce foundational strategies that can increase the reproducibility of your work and present a potential end-to-end reproducible workflow using a suite of tools, including git, RStudio, Binder, and Zenodo. Configuration for the hands-on portion of the workshop will be sent to participants one week before the workshop.
The R language has become a popular option for working with geospatial data. Compared to traditional GIS software, the code-driven approach of R can be more reproducible and efficient. This workshop gives participants the skills to perform geospatial workflows entirely within R. We will discuss how different types of geospatial data work in R, walk through examples of data operations, and explore common analysis methods for geospatial data.
While Adobe Illustrator is my preferred software for producing diagrams, PowerPoint is quite full-featured and a great option for those who don't have access to Illustrator. Plus, any skills you gain drawing diagrams will help you create better presentation slides! In this workshop I'll cover basic shape creation and manipulation in Microsoft PowerPoint, including ways to make your own icons with shape combinations.
This workshop will introduce data management practices for researchers to consider and apply throughout the research lifecycle. Good data management practices pertaining to planning, organization, documentation, storage and backup, sharing, citation, and preservation will be presented using examples that span disciplines. The second hour of this workshop will offer a mini "tour" of research data management tools including GitHub, LabArchives, OSF, and Tropy and provide a framework for considering how to assess data management tools for future adoption.