MSc Defence: Harshavardhan Gadgil
Date and Time
Location
Room 101, J.D. MacLachlan Building
Details
Title: Data Integration from Multiple Historical Sources to Study Canadian Casualties of WW1
Abstract:
Longitudinal data (data that tracks entities over a period of time) is of interest to historians and social scientists because it creates opportunities to perform comprehensive analyses of chronological events. In this thesis, we construct longitudinal data by integrating data from four historical sources to study Canadian casualties of World War I. Due to the unavailability of labeled data for two out of three linkage tasks and our application's low tolerance for false matches, we develop a simple stepwise deterministic strategy to integrate the four datasets. For the one linkage task where labeled data is available, we compare this strategy with linkage that incorporates a Support Vector Machine. With the longitudinal dataset constructed, we demonstrate its utility by performing a multivariate regression analysis to determine the factors that influenced a Canadian soldier's likelihood of survival in World War I. The findings of this research indicate that a carefully crafted stepwise deterministic strategy that incorporates approximate comparisons and domain knowledge can perform on par with a linkage approach that incorporates a supervised learning algorithm. The regression analysis reveals several fascinating patterns of historical importance in early 20th-century Canada that warrant further historical investigation.
Chair: Dr. Xining Li
Co-Advisor: Dr. Luiza Antonie
Advisory Committee Member: Dr. Kris Inwood
Non-Advisory Committee Member: Dr. Judi McCuaig