Technology — December 13, 2018 at 7:02 pm

Data driven 2018

By Chelsea Alphonso

What is data driven?

On a snowy mid-November Friday in downtown Toronto, more than 85 people gathered at Google Canada Headquarters to talk about all things data journalism. Journalists discussed how they investigate stories, from tracking recidivism in securities fraud to counting how many people have died on Grey’s Anatomy.

The Data Driven 2.0 symposium, presented by Humber, was the second data journalism conference, featuring a line-up of coders, investigative journalists and educators from Canada, the U.S. and as far away as South Sudan.

Guest speakers discussed the challenges in gathering data, the ways in which information is presented, and the importance of ensuring the information is accessible across all devices. 

Through Lunch and Learns and story Show and Tells, data journalists also talked about web scrapers, coding, spreadsheets, data literacy and ways to incorporate data into regular beat reporting.

Carolyn Thompson and South Sudanese journalist Lagu Joseph discuss the difficulty and dangers of pursuing data in a hostile country. (Photo courtesy of Chelsea Alphonso)

Tools for data journalists

Collecting and organizing data can seem like a task that is too large to take on if the right tools aren’t available. 

Journalists at Data Driven discussed Python and the time and effort it saves when collecting and cleaning up large data sets.

Jessie Willms talked about getting started when working on data projects. 

“Start with a focusing question, choose the right tools, find an approach that works (or works best), refine your idea, build something for users to see,” she says.  

There are a number of tools that journalists lean on when trying to find an approach that works.

1. Spreadsheets

A great place to start with your project is by compiling the data in a spreadsheet. The versatility and ease of use make spreadsheets a go-to tool for data journalists. Excel is considered standard.

2. SQL 

If you are dealing with a particularly large set of data that needs to be sorted and analyzed (or “queried”), SQL is the natural next step. SQL allows you to extract specific subsets of data or perform these queries across related data sets. You can also save your previous queries so that you can document everything you have done with your data and easily repeat it in the future. 

3. Data cleaning tools 

Once the data is compiled, it is time to clean the data and put it in a useful format. There are many tools that can be used for this process, such as OpenRefine or WinPure. The choice depends on what you need the tool to do and on your personal preferences.

4. Visualization tools 

Visualization is not just decoration but a good way to make your data easily understood. Whether you use a graph, chart or infographic, tools such as Tableau, Google Data Studio and OpenRefine are useful ways to show outliers and trends.

5. Programming Languages

Programming languages such as Python are the most helpful with data extraction and database building. Coding allows you to build your own very specific tools for each dataset you work with.
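
As a rough illustration of how the cleaning, SQL and programming steps above can fit together, here is a minimal Python sketch. It is not a tool shown at the symposium. The sample rent listings, the column names and the helper functions clean_rows and count_listings_under are all invented for this example, and it relies only on Python's built-in csv and sqlite3 modules.

```python
import csv
import io
import sqlite3

# Hypothetical sample data, invented for illustration only.
RAW_CSV = """neighbourhood,rent
 Downtown ,1650
downtown,1800
Annex,1500
 ANNEX,abc
Leslieville,1700
"""


def clean_rows(raw_text):
    """Trim whitespace, normalize capitalization and drop rows with a non-numeric rent."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = row["neighbourhood"].strip().title()
        rent = row["rent"].strip()
        if not rent.isdigit():
            continue  # skip rows where the rent is not a whole number
        cleaned.append((name, int(rent)))
    return cleaned


def count_listings_under(rows, max_rent):
    """Load rows into an in-memory SQLite table and count listings below max_rent per neighbourhood."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (neighbourhood TEXT, rent INTEGER)")
    conn.executemany("INSERT INTO listings VALUES (?, ?)", rows)
    query = """
        SELECT neighbourhood, COUNT(*)
        FROM listings
        WHERE rent < ?
        GROUP BY neighbourhood
        ORDER BY neighbourhood
    """
    result = conn.execute(query, (max_rent,)).fetchall()
    conn.close()
    return result


rows = clean_rows(RAW_CSV)
print(count_listings_under(rows, 1700))  # prints [('Annex', 1), ('Downtown', 1)]
```

Keeping the query as a saved string in the script is one simple way to document what was done to the data so the same analysis can be repeated later.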

(Left to right) Takara Small, Adam Hooper, Lewis Wynne-Jones and Francesca Fionda discuss the new era of collaboration, experimentation and growth in news organizations. (Photo courtesy of Chelsea Alphonso)

Using data in everyday life

Data can be collected in every facet of our lives, from how many hours we collectively spend on Netflix every month to how many apartments are available for under $1,700 in Toronto.

Francesca Fionda encourages journalists to look at it as something more than numbers. 

“It is an important and powerful tool and you should treat it like a character you would in any story. Interrogate it, ask questions,” she says.

Data can often feel intimidating, says Fionda. The amount of raw data available keeps growing.

But Fionda says there are two things that are important to keep in mind.

“Curiosity, always be questioning,” she says, “and don’t be afraid to question how data was gathered and where it is coming from.”

“Second, tenacity, you just need to keep going. Often, there are a lot of barriers to getting a lot of information.”

Fionda says that means difficult information sometimes doesn’t make it out there.

“It is easy to stop when someone says no or isn’t available,” she says. 

Fionda says the key is to treat the inevitable “no” as a maybe.  

For journalists who want to learn more about data journalism, Fionda says that finding someone who is doing what you want to be doing is a good start. 

Jessie Willms, a visual journalist at the Globe and Mail and a presenter at Data Driven, discussed the idea of developing your skills through a side interest.

Willms wanted to learn Python. She also wanted to know how many people died on Grey’s Anatomy. So, she used data from the TV show to teach herself to code, with a little help from her colleagues.

Like a credible source, data can substantiate claims that otherwise wouldn’t hold weight. When statistics aren’t readily available, raw data is the next best thing.

Fionda says it’s not that scary to deal with data.

 “Don’t be afraid of numbers, don’t be afraid of spreadsheets. It can be part of every single story that you do, and you don’t need a lot of skills to jump in.” 

 

Adam Hooper discusses his open-source tool Workbench. (Photo courtesy of Chelsea Alphonso)

Teaching and learning with data

The constant changes in the journalism industry mean that there need to be constant changes when educating aspiring young journalists, too. David Weisz is one of the journalism faculty members at Humber College involved in spearheading a curriculum update for both the college’s advanced diploma program and the post-graduate program. 

“The emphasis on teaching data in journalism school is the realization that more and more of our lives are digitized,” Weisz says. “More and more of our information is being held in databases that we sometimes never get to see except for when there is a massive privacy breach.” 

He says this means journalism has to keep up with the times.

“Journalists have to be prepared to be able to interpret everything. They have to be as comfortable sifting through data online as they are speaking and interviewing people on the street,” says Weisz.

Although data won’t be the main focus of the journalism program, Weisz says that a major goal is to make students comfortable with interrogating data as a source.

Social networks and the spread of information online have made it more and more important for journalists to have at least a passing knowledge of where to find data.

“The same way that everyone can essentially write a lead, everyone can do an inverted pyramid or write a story from start to finish, that is what we are essentially trying to do but with data,” Weisz says.

Weisz says teaching data as a frame of mind will be the angle Humber starts with. 

He defines a “data state of mind” as understanding where to find the data and how to communicate the data to readers.

Although many speakers at this year’s Data Driven event talked about the value of using programming languages like Python to make the processes of data journalism smoother, Weisz says that learning to code is not necessary to develop data fluency.

Fred Vallance-Jones, who teaches at King’s College in Halifax, one of the country’s data journalism educational hubs, says that at its core, data journalism is about working with different electronic data sets and making sense of them.

As more and more of our lives become integrated with technology, data will become a greater part of every story. 

Vallance-Jones, author of Digging Deeper: A Canadian Reporter’s Research Guide, hosted a lunch and learn at the event where he introduced entry-level data concepts and skills to attendees.

“If you think about it, every single day more information is inputted in the world, through social networks and other sorts of data channels, than existed in the entire history of humanity. So, if journalists aren’t able to work with data then they are basically ignoring almost all the information in the world,” says Vallance-Jones.

 
