Data management? Does it matter?

Why should you do data management?

We have all experienced an unexpected laptop shutdown, or lost files we were still working on and had not saved yet. Maybe these nightmares are sometimes unavoidable; otherwise I wouldn’t call them unexpected, and we would not have cursed when they happened.

The most attractive benefits of data management are:

  • To reproduce your analysis
    Science should be reproducible, and so should DATA SCIENCE. Even if you do not work in a data-driven field, you will definitely want to trace back through your file records at some point.
  • To work coherently and efficiently with yourself
    Sooner or later we all forget where and how we stored our files. Data management saves you a great deal of time that you would otherwise spend lost in the mess.
  • To help your colleagues understand the analysis
    When you work in a team, it is quite common for someone to join after the project has already started and to take over part of the tasks. Good data management gives that new colleague a big picture of what has been completed and what still needs to be fixed.
  • To enhance the accuracy of your work
    Good data management leads to well-structured folders, files, code, and documents. When a problem occurs, it is easier to break down the error and find the bug.

My experience with poor data management

Looking back on my experience in managing data, I made a bunch of mistakes, which definitely annoyed my colleagues and tripped me up. I don’t mind writing these bad examples down to highlight the importance of good data management.

A project died along the way

I was once working on a project entitled “Enterovirus epidemic and class suspension in Taiwan”. That was the very first time I had access to governmental data. Without any analysis plan, I used only a cloud drive shared with my supervisor to keep records of the data analyses, and those records were incomplete. In the end, I was not able to continue the project and publish the results, so my supervisor took over the work. But when he asked me what we had done together in the analyses and what extra work I had managed to conduct on my own, I couldn’t properly answer his questions, because I did not have an analysis plan! All the data was kept in aggregated form, which meant that we were unable to trace back through the working records. If I had learned data management earlier, the project wouldn’t have had to be halted.

My colleagues got mad

The other experience also happened during my undergrad. I was an NGO intern involved in a dengue surveillance project in Northern Malawi. The project took much longer than we expected, so the next year’s interns also needed to participate in the unfinished project. However, we were the original investigators. We had understood the whole project from the very beginning, but they had not. It took effort to explain what we had finished. In the end, even after we had left the work, the next year’s interns still did not grasp the full results and working process of the project. And as far as I understood, they were not very happy about the handover. If I had learned data management earlier, I would have handed over our work much better.

Solution: good data management

I picked up good data management while working as a master’s thesis student at the Department of Medical Biostatistics and Epidemiology. I benefitted from the good data management workshop and courses offered by Anna Johansson and the data management group. I am quite sure that, given good data management, those two stories would not have occurred! Isn’t that good news!?

Do you have the same issues in your data management? Then let’s learn the Seven tips of good data management together.