In order to properly analyse data, you need to structure it first. Here is a couple of tips and tricks on how to do it in an Excel table if you are only at the beginning of your adventure with data journalism.
- You will want to start with a table, which contains rows and columns. Each column corresponds to a variable, and each row corresponds to a record.
- Make sure you include only one header row at the very top of the spreadsheet. It should contain column names, one next to another. If you come across a table with multiple headers, simplify it into a single header or divide the data into multiple tables.
- Remember to include only one type of data per variable – one column should only include one type of data.
- Make copies of spreadsheets with data that you are about to analyse. You might want to use your raw data for another analysis at a later stage so keep the original file untouched.
- Add new data to the table as new rows, not new columns. Columns correspond to new variables (which you haven’t looked at before), not new “data entries” or “data records”.
- Once the data is structured into an orderly table, it is time to decide what’s needed and what’s simply obscuring the big picture. Remove or modify any rows and columns which are not necessary or sufficiently accurate.
- Take care to name your variables (columns) in a clear and concise way. You might not be the only person dealing with the file so making the names as straightforwards as possible is key to make it work.
- Make sure your data for each variable is clear and readable. It must be entered in the uniform way into each column.
- Look out for any missing data and handle it as appropriate. Leaving a cell empty is in most cases safer than inserting a “0”.
- Finally, make sure you format all data according to its type (date/number/location/text…) so that Excel and any other processing software read is correctly.
Hint: All data structured and cleaned? Visualise it. In this post, I explain how to do it quickly and easily.