Data Wrangling (Cleaning)
Data Wrangling or Data Cleaning is the process of identifying and correcting errors in data and/or making its formatting more consistent. It is often required to prepare data for analysis and/or visualisation and, where appropriate, for publishing and sharing. Data also needs to be cleaned before archiving, so that it is preserved correctly, is not misinterpreted by other users, and remains interoperable (one of the FAIR Principles).
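As a rough illustration of "making formatting more consistent", the minimal Python sketch below tidies whitespace, capitalisation and date formats in a couple of records. The records, field names and the assumption that slash-separated dates are day-first are all hypothetical and chosen only for this example; real data will need its own rules.

```python
from datetime import datetime

# Hypothetical raw records with inconsistent spacing, capitalisation and date formats.
raw_records = [
    {"site": " Orpheus Island ", "species": "acropora millepora", "date": "03/02/2021"},
    {"site": "orpheus island", "species": "Acropora Millepora ", "date": "2021-02-04"},
]

# Assumption for this sketch: slash-separated dates are day-first (DD/MM/YYYY).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def clean_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD); raise ValueError if unrecognised."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


def clean_record(record):
    """Collapse stray whitespace and apply one convention per field."""
    return {
        "site": " ".join(record["site"].split()).title(),
        "species": " ".join(record["species"].split()).capitalize(),
        "date": clean_date(record["date"]),
    }


cleaned = [clean_record(r) for r in raw_records]
for row in cleaned:
    print(row)
```

Raising an error on an unrecognised date, rather than guessing, keeps ambiguous values visible so they can be checked against the original source.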
White et al. (2013) published an excellent paper in Ideas in Ecology and Evolution. The authors noted that much of the shared data in ecology and evolutionary biology is not easily reused because it does not follow best practices for data structure, metadata and licences.
Their nine specific recommendations are:
- Share your data.
- Provide metadata.
- Provide an unprocessed form of the data.
- Use standard data formats.
- Use good null values.
- Make it easy to combine your data with other datasets.
- Perform basic quality control.
- Use an established repository.
- Use an established and liberal license.
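Two of the recommendations above, "Use good null values" and "Perform basic quality control", can be sketched in a few lines of Python. This is a minimal example under stated assumptions, not a method taken from the paper: the sample data, the list of missing-value codes, the plausible range and the choice of a blank cell as the single missing-value marker are all hypothetical.

```python
import csv
import io

# Hypothetical raw CSV: missing values are coded inconsistently (-999, NA, n/a, blank).
RAW_CSV = """site,temperature_c
A,24.5
B,-999
C,NA
D,
E,245
F,n/a
"""

NULL_CODES = {"", "NA", "N/A", "NULL", "-999"}  # assumed codes for this example
VALID_RANGE = (-5.0, 45.0)  # assumed plausible range for this hypothetical variable


def parse_value(text):
    """Return a float, or None if the cell holds a known missing-value code."""
    text = text.strip()
    if text.upper() in NULL_CODES:
        return None
    return float(text)


rows = []
problems = []  # (line number in the raw file, site, suspicious value)
for line_no, row in enumerate(csv.DictReader(io.StringIO(RAW_CSV)), start=2):
    value = parse_value(row["temperature_c"])
    if value is not None and not (VALID_RANGE[0] <= value <= VALID_RANGE[1]):
        problems.append((line_no, row["site"], value))
    rows.append({"site": row["site"], "temperature_c": value})

# Write the table back out with a blank cell as the only missing-value marker.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["site", "temperature_c"], lineterminator="\n")
writer.writeheader()
for row in rows:
    value = row["temperature_c"]
    writer.writerow({"site": row["site"], "temperature_c": "" if value is None else value})
cleaned_csv = out.getvalue()

print(cleaned_csv)
print("Values to check against the source:", problems)
```

Out-of-range values are reported rather than silently changed or dropped, so the decision about each one stays with someone who knows the data, and the unprocessed file is left untouched.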