Information Science

1265 affiliated resources

Data Management Skillbuilding Hub - DataONE
Unrestricted Use
Public Domain
Rating
0.0 stars

The Data Management Skillbuilding Hub is a repository for open educational resources on data management: a collection of learning resources freely contributed by anyone willing to share them. Materials such as lessons, best practices, and videos are stored in the DataONEorg GitHub repository and are searchable through the Data Management Training Clearinghouse. We invite you to submit your own educational resources so that the Data Management Skillbuilding Hub can remain an up-to-date and sustainable educational tool for all to benefit from. You can easily contribute learning materials to the Skillbuilding Hub via GitHub online.

Subject:
Applied Science
Information Science
Material Type:
Lesson
Primary Source
Provider:
DataONE
Date Added:
03/21/2022
Data Management and Governance Glossary
Conditional Remix & Share Permitted
CC BY-NC
Rating
0.0 stars

A Claremont Graduate University EDUC 448 Fall 2021 Course Publication

Short Description:
This glossary is intended to support professionals who are seeking to understand Data Management and Governance in the context of K-12 and higher education. The definitions included in this ebook provide a fundamental understanding of common Data Management and Governance terms. This glossary was co-created by education professionals and graduate students enrolled in Claremont Graduate University’s EDUC 448: Data Management & Governance course taught by Dr. Gwen Garrison, PhD during the Fall 2021 semester.

Word Count: 2578

(Note: This resource's metadata has been created automatically by reformatting and/or combining the information that the author initially provided as part of a bulk import process.)

Subject:
Applied Science
Computer Science
Information Science
Material Type:
Textbook
Provider:
Claremont Colleges
Date Added:
01/11/2021
Data Management with SQL for Ecologists
Unrestricted Use
CC BY
Rating
0.0 stars

Databases are useful for both storing and using data effectively. Using a relational database serves several purposes. It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data, we can rerun a query to find all the data that meets certain criteria. It’s fast, even for large amounts of data. It improves quality control of data entry (type constraints and use of forms in Access, FileMaker, etc.). The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them, and how you can query databases to extract just the information that you need.
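
As a rough illustration of the "rerun a query when new data arrives" point in the description above, here is a minimal sketch using Python's built-in sqlite3 module. The table name `surveys`, its columns, and the sample rows are hypothetical and are not taken from the lesson itself.

```python
import sqlite3

# Hypothetical query: every record at or above a weight threshold.
QUERY = (
    "SELECT record_id, species, weight FROM surveys "
    "WHERE weight >= ? ORDER BY record_id"
)


def make_database(rows):
    """Create an in-memory database and load the given rows into it."""
    conn = sqlite3.connect(":memory:")
    # Type and NOT NULL constraints are one form of data-entry quality control.
    conn.execute(
        "CREATE TABLE surveys ("
        "record_id INTEGER PRIMARY KEY, "
        "species TEXT NOT NULL, "
        "weight REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO surveys VALUES (?, ?, ?)", rows)
    return conn


def heavy_records(conn, min_weight):
    """Run the stored query; the data itself is never modified by the analysis."""
    return conn.execute(QUERY, (min_weight,)).fetchall()


conn = make_database([(1, "DM", 40.0), (2, "PF", 8.0), (3, "DM", 48.0)])
first_result = heavy_records(conn, 40.0)

# New data arrives: insert it, then rerun exactly the same query.
conn.execute("INSERT INTO surveys VALUES (?, ?, ?)", (4, "NL", 120.0))
second_result = heavy_records(conn, 40.0)
conn.close()
```

The query text stays fixed while the data changes, which is the separation of data from analysis that the description refers to.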

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Christina Koch
Donal Heidenblad
Katy Felkner
Rémi Rampin
Timothée Poisot
Date Added:
03/20/2017
Data Management with SQL for Social Scientists
Unrestricted Use
CC BY
Rating
0.0 stars

This is an alpha lesson to teach Data Management with SQL for Social Scientists. We welcome any criticism or error reports and will take your feedback into account to improve both the presentation and the content. Databases are useful for both storing and using data effectively. Using a relational database serves several purposes. It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data, we can rerun a query to find all the data that meets certain criteria. It’s fast, even for large amounts of data. It improves quality control of data entry (type constraints and use of forms in Access, FileMaker, etc.). The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them, and how you can query databases to extract just the information that you need.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Social Science
Material Type:
Module
Provider:
The Carpentries
Author:
Peter Smyth
Date Added:
08/07/2020
Data Organization in Spreadsheets for Ecologists
Unrestricted Use
CC BY
Rating
0.0 stars

Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. We organize data in spreadsheets in the ways that we as humans want to work with the data, but computers require that data be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too!

In this lesson, you will learn:
• Good data entry practices - formatting data tables in spreadsheets
• How to avoid common formatting mistakes
• Approaches for handling dates in spreadsheets
• Basic quality control and data manipulation in spreadsheets
• Exporting data from spreadsheets

In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.
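
The description mentions handling dates and exporting data from spreadsheets. One common approach, shown here only as a sketch and not necessarily the one the lesson teaches, is to store each date as separate year, month, and day columns and to export a plain CSV file with one observation per row. The column names and sample records below are hypothetical.

```python
import csv
import io
from datetime import date


def split_date(iso_text):
    """Parse a YYYY-MM-DD string into separate year, month, and day integers."""
    parsed = date.fromisoformat(iso_text)
    return parsed.year, parsed.month, parsed.day


def export_csv(records):
    """Return CSV text with a single header row and one observation per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["plot", "year", "month", "day", "count"])
    for plot, iso_text, count in records:
        writer.writerow([plot, *split_date(iso_text), count])
    return buffer.getvalue()


# Hypothetical field records: (plot, date, count).
csv_text = export_csv([("A1", "2014-07-16", 3), ("B2", "2014-11-02", 5)])
print(csv_text)
```

Separate numeric columns avoid the ambiguity of spreadsheet date formats, and the CSV output can be read directly by R or Python.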

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Christie Bahlai
Peter R. Hoyt
Tracy Teal
Date Added:
03/20/2017
Data Organization in Spreadsheets for Social Scientists
Unrestricted Use
CC BY
Rating
0.0 stars

Lesson on spreadsheets for social scientists. Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. Typically, we organize data in spreadsheets in ways that we as humans want to work with the data. However, computers require data to be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too!

In this lesson, you will learn:
• Good data entry practices - formatting data tables in spreadsheets
• How to avoid common formatting mistakes
• Approaches for handling dates in spreadsheets
• Basic quality control and data manipulation in spreadsheets
• Exporting data from spreadsheets

In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.

Subject:
Applied Science
Information Science
Mathematics
Measurement and Data
Social Science
Material Type:
Module
Provider:
The Carpentries
Author:
David Mawdsley
Erin Becker
François Michonneau
Karen Word
Lachlan Deer
Peter Smyth
Date Added:
08/07/2020
Data Quality Control and Assurance
Unrestricted Use
Public Domain
Rating
0.0 stars

Quality assurance and quality control are phrases used to describe activities that prevent errors from entering or staying in a data set. These activities ensure the quality of the data before it is collected, entered, or analyzed, and actively monitor and maintain the quality of data throughout the study. In this lesson, we define and provide examples of quality assurance, quality control, data contamination, and types of errors that may be found in data sets. After completing this lesson, participants will be able to describe best practices in quality assurance and quality control and relate them to different phases of data collection and entry.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lesson
Provider:
DataONE
Author:
DataONE Community Engagement & Outreach Working Group
Date Added:
11/21/2020
Data Sharing
Unrestricted Use
Public Domain
Rating
0.0 stars

When first sharing research data, researchers often raise questions about the value, benefits, and mechanisms for sharing. Many stakeholders and interested parties, such as funding agencies, communities, other researchers, or members of the public may be interested in research, results and related data. This lesson addresses data sharing in the context of the data life cycle, the value of sharing data, concerns about sharing data, and methods and best practices for sharing data.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lesson
Provider:
DataONE
Author:
DataONE Community Engagement & Outreach Working Group
Date Added:
11/21/2020
Data Sharing, Mandates, and Repositories
Conditional Remix & Share Permitted
CC BY-NC
Rating
0.0 stars

Some research funders mandate that data resulting from their funded research be shared. This presentation provides a general definition of data sharing and explains how scholars can identify and follow data sharing mandates.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lecture
Author:
Kristy Padron
Date Added:
11/22/2020
Data Training Engaging End-users
Conditional Remix & Share Permitted
CC BY-SA
Rating
0.0 stars

Data Tree is a free online course with all you need to know for research data management, along with ways to engage and share data with business, policymakers, media and the wider public. The self-paced training course will take 15 to 20 hours to complete in eight structured modules. The course is packed with video, quizzes and real-life examples of data management, along with valuable tips from experts in data management, data sharing and science communication. The training course materials will be available for structured learning, but also to dip into for immediate problem solving.

Data Tree is funded by the Natural Environment Research Council (NERC) through the National Productivity Investment Fund (NPIF), delivered by the Institute for Environmental Analytics and Stats4SD and supported by the Institute of Physics.

Subject:
Applied Science
Information Science
Material Type:
Module
Primary Source
Date Added:
05/16/2022
Data Wrangling and Processing for Genomics
Unrestricted Use
CC BY
Rating
0.0 stars

Data Carpentry lesson to learn how to use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. A lot of genomics analysis is done using command-line tools for three reasons: 1) you will often be working with a large number of files, and working through the command-line rather than through a graphical user interface (GUI) allows you to automate repetitive tasks, 2) you will often need more compute power than is available on your personal computer, and connecting to and interacting with remote computers requires a command-line interface, and 3) you will often need to customize your analyses, and command-line tools often enable more customization than the corresponding GUI tools (if in fact a GUI tool even exists). In a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this lesson, you will be applying this new knowledge to carry out a common genomics workflow - identifying variants among sequencing samples taken from multiple individuals within a population. We will be starting with a set of sequenced reads (.fastq files), performing some quality control steps, aligning those reads to a reference genome, and ending by identifying and visualizing variations among these samples. As you progress through this lesson, keep in mind that, even if you aren’t going to be doing this same workflow in your research, you will be learning some very important lessons about using command-line bioinformatic tools. What you learn here will enable you to use a variety of bioinformatic tools with confidence and greatly enhance your research efficiency and productivity.

Subject:
Applied Science
Computer Science
Genetics
Information Science
Life Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Adam Thomas
Ahmed R. Hasan
Aniello Infante
Anita Schürch
Dev Paudel
Erin Alison Becker
Fotis Psomopoulos
François Michonneau
Gaius Augustus
Gregg TeHennepe
Jason Williams
Jessica Elizabeth Mizzi
Karen Cranston
Kari L Jordan
Kate Crosby
Kevin Weitemier
Lex Nederbragt
Luis Avila
Peter R. Hoyt
Rayna Michelle Harris
Ryan Peek
Sheldon John McKay
Sheldon McKay
Taylor Reiter
Tessa Pierce
Toby Hodges
Tracy Teal
Vasilis Lenis
Winni Kretzschmar
dbmarchant
Date Added:
08/07/2020
Data Wrangling with R
Conditional Remix & Share Permitted
CC BY-NC-SA
Rating
0.0 stars

Cleaning, reshaping, and transforming data for analysis and visualization, with R and the Tidyverse

Word Count: 3515

(Note: This resource's metadata has been created automatically by reformatting and/or combining the information that the author initially provided as part of a bulk import process.)

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Social Science
Sociology
Statistics and Probability
Material Type:
Textbook
Provider:
College of DuPage Press, 2022
Author:
Christine Monnier
Date Added:
07/13/2022
Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition
Unrestricted Use
CC BY
Rating
0.0 stars

Access to data is a critical feature of an efficient, progressive and ultimately self-correcting scientific ecosystem. But the extent to which in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data (‘analytic reproducibility’). To investigate this, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data available statements (104/417, 25% pre-policy to 136/174, 78% post-policy), although not all data appeared reusable (23/104, 22% pre-policy to 85/136, 62%, post-policy). For 35 of the articles determined to have reusable data, we attempted to reproduce 1324 target values. Ultimately, 64 values could not be reproduced within a 10% margin of error. For 22 articles all target values were reproduced, but 11 of these required author assistance. For 13 articles at least one value could not be reproduced despite author assistance. Importantly, there were no clear indications that original conclusions were seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.
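
To make the arithmetic in the abstract concrete, the sketch below shows one way a "10% margin of error" check and the reported proportions could be computed. The percentage-error formula and the treatment of a value exactly at the margin are assumptions made for illustration; the article's own definition may differ. The function names are hypothetical.

```python
def percent_error(reported, reproduced):
    """Absolute difference as a percentage of the reported value (assumed formula)."""
    if reported == 0:
        raise ValueError("percentage error is undefined for a reported value of zero")
    return abs(reproduced - reported) / abs(reported) * 100.0


def within_margin(reported, reproduced, margin_percent=10.0):
    """True when the reproduced value is inside the margin (boundary handling assumed)."""
    return percent_error(reported, reproduced) < margin_percent


def share_percent(count, total):
    """Proportion expressed as a whole-number percentage."""
    return round(count / total * 100)


# Proportions quoted in the abstract.
pre_policy_available = share_percent(104, 417)
post_policy_available = share_percent(136, 174)

# Hypothetical reported and reproduced values, for illustration only.
close_enough = within_margin(2.50, 2.60)
too_far = within_margin(2.50, 3.00)
```

With these inputs the two shares round to the 25% and 78% given in the abstract.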

Subject:
Applied Science
Information Science
Material Type:
Reading
Provider:
Royal Society Open Science
Author:
Alicia Hofelich Mohr
Bria Long
Elizabeth Clayton
Erica J. Yoon
George C. Banks
Gustav Nilsonne
Kyle MacDonald
Mallory C. Kidwell
Maya B. Mathur
Michael C. Frank
Michael Henry Tessler
Richie L. Lenne
Sara Altman
Tom E. Hardwicke
Date Added:
08/07/2020
Database (08:01): Database Fundamentals
Only Sharing Permitted
CC BY-ND
Rating
0.0 stars

The first video in our database lesson, part of the Introduction to Computer series.
This video looks at the basics of databases. We define database, as well as key terms to know.

Subject:
Applied Science
Business and Communication
Information Science
Material Type:
Lecture
Provider:
Mr. Ford's Class
Author:
Scott Ford
Date Added:
09/26/2014
Database (08:02): Database Management Systems
Only Sharing Permitted
CC BY-ND
Rating
0.0 stars

A database management system (DBMS) is the software that allows us to create and use a database. This video looks at DBMSs, their functions, and some examples of popular software solutions, and takes a quick look at Structured Query Language (SQL).

Subject:
Applied Science
Business and Communication
Information Science
Material Type:
Lecture
Provider:
Mr. Ford's Class
Author:
Scott Ford
Date Added:
09/26/2014
Database (08:03): Database Models
Only Sharing Permitted
CC BY-ND
Rating
0.0 stars

The database management software is the program used to create and manage the database. The database model is the architecture the DBMS uses to store objects within that database.

Subject:
Applied Science
Business and Communication
Information Science
Material Type:
Lecture
Provider:
Mr. Ford's Class
Author:
Scott Ford
Date Added:
09/26/2014
Database (08:04): Some Final Bits
Only Sharing Permitted
CC BY-ND
Rating
0.0 stars

Our final database video. This one looks at some odds and ends. We examine: Data Warehouse, Data Mining, Big Data. I also talk about the ethics of data mining by the NSA and CDC, and how they are different.

We also give our top picks for the lesson.

Links from Video:
• http://www.w3schools.com/sql/
• What is Database & SQL by Guru99 http://youtu.be/FR4QIeZaPeM
• What is a database http://youtu.be/t8jgX1f8kc4
• MySQL Database For Beginners https://www.udemy.com/mysql-database-for-beginners2/

Subject:
Applied Science
Business and Communication
Information Science
Material Type:
Lecture
Provider:
Mr. Ford's Class
Author:
Scott Ford
Date Added:
09/26/2014
Database Design - 2nd Edition
Unrestricted Use
CC BY
Rating
0.0 stars

Short Description:
Database Design - 2nd Edition covers database systems and database design concepts. New to this edition are SQL info, additional examples, key terms and review exercises at the end of each chapter.

Long Description:
This second edition of the Database Design book covers the concepts used in database systems and the database design process. Topics include:
• The history of databases
• Characteristics and benefits of databases
• Data models
• Data modelling
• Classification of database management systems
• Integrity rules and constraints
• Functional dependencies
• Normalization
• Database development process

Word Count: 30650

(Note: This resource's metadata has been created automatically by reformatting and/or combining the information that the author initially provided as part of a bulk import process.)

Subject:
Applied Science
Computer Science
Information Science
Material Type:
Textbook
Provider:
BCcampus
Date Added:
10/24/2014
Databases and SQL
Unrestricted Use
CC BY
Rating
0.0 stars

Software Carpentry lesson that teaches how to use databases and SQL. In the late 1920s and early 1930s, William Dyer, Frank Pabodie, and Valentina Roerich led expeditions to the Pole of Inaccessibility in the South Pacific, and then onward to Antarctica. Two years ago, records of their expeditions were found in a storage locker at Miskatonic University. We have scanned and OCR'd the data they contain, and we now want to store that information in a way that will make search and analysis easy. Three common options for storage are text files, spreadsheets, and databases. Text files are easiest to create, and work well with version control, but then we would have to build search and analysis tools ourselves. Spreadsheets are good for doing simple analyses, but they don’t handle large or complex data sets well. Databases, however, include powerful tools for search and analysis, and can handle large, complex data sets. These lessons will show how to use a database to explore the expeditions’ data.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Amy Brown
Andrew Boughton
Andrew Kubiak
Avishek Kumar
Ben Waugh
Bill Mills
Brian Ballsun-Stanton
Chris Tomlinson
Colleen Fallaw
Dan Michael Heggø
Daniel Suess
Dave Welch
David W Wright
Deborah Gertrude Digges
Donny Winston
Doug Latornell
Erin Alison Becker
Ethan Nelson
Ethan P White
François Michonneau
George Graham
Gerard Capes
Gideon Juve
Greg Wilson
Ioan Vancea
Jake Lever
James Mickley
John Blischak
JohnRMoreau@gmail.com
Jonah Duckles
Jonathan Guyer
Joshua Nahum
Kate Hertweck
Kevin Dyke
Louis Vernon
Luc Small
Luke William Johnston
Maneesha Sane
Mark Stacy
Matthew Collins
Matty Jones
Mike Jackson
Morgan Taschuk
Patrick McCann
Paula Andrea Martinez
Pauline Barmby
Piotr Banaszkiewicz
Raniere Silva
Ray Bell
Rayna Michelle Harris
Rémi Emonet
Rémi Rampin
Seda Arat
Sheldon John McKay
Sheldon McKay
Stephen Davison
Thomas Guignard
Trevor Bekolay
lorra
slimlime
Date Added:
03/20/2017
Data.gov
Unrestricted Use
Public Domain
Rating
0.0 stars

The home of the U.S. Government’s open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more. Topics include Agriculture, Business, Climate, Education, Energy, Ecosystems, Manufacturing and more.

Subject:
Applied Science
Information Science
Life Science
Physical Science
Social Science
Material Type:
Data Set
Provider:
U.S. General Services Administration
Provider Set:
Office of Citizen Services and Innovative Technologies
Date Added:
03/04/2016