Question



Using the mapping.csv file with Article Data

Dear Dr. Ballings,

When we are creating our Article Data basetable, how do we use the mapping.csv file? I know we have to merge the 'companies' files and the
'stories' files, and then use the mapping.csv file to match the COMPANY_ID with the TICKER. I am confused about what to do with the
RANGE_START and RANGE_END columns.

Thank you for your help, I greatly appreciate it!





Answers and follow-up questions





Answer or follow-up question 1

Dear student,

The mapping file contains the links between the internal company ID and the ticker name.
By internal I mean that the ID is only valid within the three article files.

The internal company ID may be assigned to different ticker names at different points in time.

For example, company ID 83E0B2 links to EBOD between 2000-01-01 and 2013-02-11.
The same ID links to EBODF from 2000-01-01 until now (now is Jan 1, 2016); the empty RANGE_END in the mapping means the link has no end date yet.

COMPANY_ID,TICKER,RANGE_START,RANGE_END
83E0B2,EBOD,2000-01-01,2013-02-11
83E0B2,EBODF,2000-01-01,
83E0B2,TRFDF,2000-01-01,2006-01-09
83E0B2,CTDC,2006-01-09,2012-09-24

EBOD, EBODF, TRFDF, and CTDC are all connected to the same company (think mergers, acquisitions, ticker name changes).
According to the mapping above, only EBODF is still active (it is the only row without a RANGE_END).
http://www.bloomberg.com/quote/EBOD:US
http://www.bloomberg.com/quote/EBODF:US
http://www.bloomberg.com/quote/TRFDF:US
http://www.bloomberg.com/quote/CTDC:US

The same company ID can be related to multiple tickers (as a company can have multiple tickers at the same time).
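
To see how one company ID can resolve to several tickers on the same date, here is a small sketch in plain Python using the four mapping rows above. The function names (load_mapping, active_tickers) are made up for illustration, and I am assuming that both RANGE_START and RANGE_END are inclusive and that an empty RANGE_END means "still active"; check those assumptions against your own data.

```python
import csv
import io
from datetime import date

# The four example rows from the mapping file.
MAPPING_CSV = """COMPANY_ID,TICKER,RANGE_START,RANGE_END
83E0B2,EBOD,2000-01-01,2013-02-11
83E0B2,EBODF,2000-01-01,
83E0B2,TRFDF,2000-01-01,2006-01-09
83E0B2,CTDC,2006-01-09,2012-09-24
"""


def load_mapping(text):
    """Parse the mapping CSV into (company_id, ticker, start, end) tuples."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        start = date.fromisoformat(row["RANGE_START"])
        # Assumption: an empty RANGE_END means the link is still open.
        end = date.fromisoformat(row["RANGE_END"]) if row["RANGE_END"] else date.max
        rows.append((row["COMPANY_ID"], row["TICKER"], start, end))
    return rows


def active_tickers(mapping, company_id, on):
    """Return the tickers linked to company_id on the given date (inclusive bounds)."""
    return sorted(
        ticker
        for cid, ticker, start, end in mapping
        if cid == company_id and start <= on <= end
    )


mapping = load_mapping(MAPPING_CSV)
print(active_tickers(mapping, "83E0B2", date(2005, 6, 1)))  # ['EBOD', 'EBODF', 'TRFDF']
print(active_tickers(mapping, "83E0B2", date(2015, 6, 1)))  # ['EBODF']
```

With inclusive bounds, an article dated exactly 2006-01-09 would match both TRFDF and CTDC, so decide up front how you want to treat boundary dates.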

How to handle this? When you merge your stories with the mapping file, keep only the tickers that were active
on the date the article was published.

For example, first do an inner merge on COMPANY_ID, and then delete the rows where the article's publication date
does not fall between RANGE_START and RANGE_END.
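
The merge-then-filter idea can be sketched as follows in plain Python; the same two steps carry over to a data-frame merge in whatever tool you use. The article rows and the column names STORY_ID and PUBLISH_DATE are hypothetical (your stories files may name them differently), the helper names are mine, and I am again assuming inclusive range bounds and that a missing RANGE_END means the ticker is still active.

```python
from datetime import date

# Mapping rows copied from the example in this answer; a missing RANGE_END is None.
mapping = [
    {"COMPANY_ID": "83E0B2", "TICKER": "EBOD",
     "RANGE_START": date(2000, 1, 1), "RANGE_END": date(2013, 2, 11)},
    {"COMPANY_ID": "83E0B2", "TICKER": "EBODF",
     "RANGE_START": date(2000, 1, 1), "RANGE_END": None},
    {"COMPANY_ID": "83E0B2", "TICKER": "TRFDF",
     "RANGE_START": date(2000, 1, 1), "RANGE_END": date(2006, 1, 9)},
    {"COMPANY_ID": "83E0B2", "TICKER": "CTDC",
     "RANGE_START": date(2006, 1, 9), "RANGE_END": date(2012, 9, 24)},
]

# Hypothetical articles; STORY_ID and PUBLISH_DATE are illustrative names.
stories = [
    {"STORY_ID": "s1", "COMPANY_ID": "83E0B2", "PUBLISH_DATE": date(2005, 6, 1)},
    {"STORY_ID": "s2", "COMPANY_ID": "83E0B2", "PUBLISH_DATE": date(2010, 3, 15)},
    {"STORY_ID": "s3", "COMPANY_ID": "83E0B2", "PUBLISH_DATE": date(2015, 6, 1)},
    {"STORY_ID": "s4", "COMPANY_ID": "ZZZZZZ", "PUBLISH_DATE": date(2010, 3, 15)},
]


def inner_merge(stories, mapping):
    """Pair every story with every mapping row that shares its COMPANY_ID."""
    by_company = {}
    for m in mapping:
        by_company.setdefault(m["COMPANY_ID"], []).append(m)
    # Stories without a mapping row (s4 here) drop out, as in an inner merge.
    return [{**s, **m} for s in stories for m in by_company.get(s["COMPANY_ID"], [])]


def keep_active(rows):
    """Keep only rows whose publication date lies inside the ticker's range."""
    return [
        r for r in rows
        if r["RANGE_START"] <= r["PUBLISH_DATE"]
        and (r["RANGE_END"] is None or r["PUBLISH_DATE"] <= r["RANGE_END"])
    ]


merged = inner_merge(stories, mapping)  # 3 matched stories x 4 tickers = 12 rows
basetable = keep_active(merged)         # 7 rows remain after the date filter
for r in basetable:
    print(r["STORY_ID"], r["TICKER"])
```

Note that one article can still end up on several rows (s1 keeps EBOD, EBODF and TRFDF), because a company can have several tickers at the same time.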

Michel Ballings





