Last class you asked us to start working on the text mining portion of the final project. To do so, we need to read in the data tables
and merge them.

We have read in the data and have three tables: 'companies', 'stories', and 'mapping'.

This is my question:

I am not sure how to merge them, since they have different numbers of rows. Should I aggregate them first and then merge them, or should I
merge them and then aggregate?

Also, when we aggregate, should we do a double aggregate? I ask because the 'companies' table has both a company id and a story id.

Best, Ian Safie

Dear Ian,

Your stock data is organized as 'each line is one stock for a given day'. Whatever merging and aggregating you do on the article data
should therefore bring it to that same level, one row per stock per day, before you merge it with the stock data.
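
As a rough illustration of that order of operations, here is a minimal sketch in plain Python with made-up toy data. Everything in it is
an assumption rather than your actual schema: the column names (story_id, company_id, date, n_words, close), the values, and the 'links'
list, which stands in for whichever of your tables pairs story ids with company ids.

```python
from collections import defaultdict

# Hypothetical toy data; all names and values are assumptions for illustration.
stories = [
    {"story_id": 1, "date": "2015-03-02", "n_words": 120},
    {"story_id": 2, "date": "2015-03-02", "n_words": 80},
    {"story_id": 3, "date": "2015-03-03", "n_words": 200},
]
links = [  # which story mentions which company (a story may mention several)
    {"story_id": 1, "company_id": "A"},
    {"story_id": 2, "company_id": "A"},
    {"story_id": 3, "company_id": "A"},
    {"story_id": 3, "company_id": "B"},
]
stock = [  # already 'one stock for a given day'
    {"company_id": "A", "date": "2015-03-02", "close": 10.0},
    {"company_id": "A", "date": "2015-03-03", "close": 10.5},
    {"company_id": "B", "date": "2015-03-03", "close": 20.0},
    {"company_id": "B", "date": "2015-03-04", "close": 21.0},
]

# Step 1: merge the article tables, giving one record per (story, company) pair.
story_by_id = {s["story_id"]: s for s in stories}

# Step 2: aggregate those records down to one row per (company, date).
empty = {"n_stories": 0, "n_words": 0}
agg = defaultdict(lambda: dict(empty))
for link in links:
    story = story_by_id[link["story_id"]]
    key = (link["company_id"], story["date"])
    agg[key]["n_stories"] += 1
    agg[key]["n_words"] += story["n_words"]

# Step 3: left-merge onto the stock data; days without stories get zeros.
merged = []
for row in stock:
    feats = agg.get((row["company_id"], row["date"]), empty)
    merged.append({**row, **feats})

for row in merged:
    print(row)
```

Because the article data has at most one row per stock per day by the time of the final merge, the merged table keeps exactly as many
lines as the stock data.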

Michel Ballings

