Most linkage failure is attributable to the very high rates of migration characteristic of the mid-nineteenth century. The availability of high-quality census files covering entire populations will allow far more sophisticated matching than has previously been possible. Using the new database, for example, entire countries can be searched by characteristics such as age, sex, birthplace, birthplace of mother, and birthplace of father, as well as by name.
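As a rough illustration of the kind of matching described above, the following Python sketch links a person to a later census only when exactly one record agrees on name, sex, approximate birth year, and the three birthplace fields. The record layout, the field names, and the two-year tolerance are assumptions made for this example, not features of the actual database or of any linkage procedure the project has adopted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    # Hypothetical record layout; field names are illustrative assumptions.
    name: str
    sex: str
    birth_year: int  # census year minus reported age
    birthplace: str
    mother_birthplace: str
    father_birthplace: str


def is_candidate(a: Person, b: Person, year_tolerance: int = 2) -> bool:
    """True if two records agree on every field, allowing for age misreporting."""
    return (
        a.name.strip().lower() == b.name.strip().lower()
        and a.sex == b.sex
        and abs(a.birth_year - b.birth_year) <= year_tolerance
        and a.birthplace == b.birthplace
        and a.mother_birthplace == b.mother_birthplace
        and a.father_birthplace == b.father_birthplace
    )


def link(person: Person, later_census: List[Person]) -> Optional[Person]:
    """Return the single matching record, or None if there are zero or several."""
    candidates = [p for p in later_census if is_candidate(person, p)]
    return candidates[0] if len(candidates) == 1 else None
```

Rejecting ambiguous cases rather than guessing trades a lower match rate for fewer false links; a production system would more likely use probabilistic scoring and name standardization than the exact comparison shown here.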

The new database will allow a far higher rate of matches than previous studies achieved and will provide samples thousands of times larger. Moreover, because the analyses will be based on representative populations at both ends of the record linkage, any biases in the linked population will be readily detectable. Linked census data hold the promise of finally resolving some of the longest-running debates in nineteenth-century social history.

Past studies of social and geographic mobility were ultimately inconclusive because they excluded migrants and relied on small samples. Scholars will be able to gauge the extent of social and geographic mobility, analyze the interrelationship of geographic and economic movement, and assess trends and differentials in social mobility far more reliably than heretofore (Thorvaldsen). In addition, the linked samples will allow investigation of questions regarding family formation and dissolution.

For example, they will allow us to answer several controversial questions surrounding the formation of multigenerational households in the nineteenth century (Ruggles a).

Multilevel analysis. In recent years, multilevel analyses of the effects of local context on individual behavior have proven exceedingly valuable tools for research in historical sociology (see, for example, Elman; Kramarow; Ruggles a, b). A key problem for such nineteenth-century research, however, is that the method requires independent variables tabulated for small geographic units, and such data are scarce before the twentieth century.

The new North Atlantic sample will allow creation of a wide variety of contextual variables (such as racial or ethnic composition, female labor-force participation, and occupational structure) at any geographic level, including the block, the neighborhood, and the enumeration district.

Geographic Information Systems. Geographers are ordinarily unable to tap the power of microdata. The existing nineteenth-century microdata files are samples, so when they are used for small areas they provide insufficient precision for reliable mapping. Although some relatively high-density samples are available for the period since , those microdata files suppress detailed geographic data. Therefore, geographers are forced to rely on complete count aggregate data that usually provide only basic summary statistics for small areas.

The North Atlantic census database will provide full geographic detail for every individual in the population. Digitized small-area boundary files are already in preparation for nineteenth-century Norway and Britain, and a pending proposal to the National Science Foundation would provide a similar resource for the United States.

Thus, there is already a large scholarly investment in nineteenth-century geographic information systems. What is lacking is a fine level of geographic detail in social, economic, and demographic characteristics.

The North Atlantic census database will allow scholars to marry existing geographic boundary files to population characteristics, thus creating a powerful new analytic tool. Such fine geographic analysis will be especially potent in the analysis of topics such as early suburban development and racial and ethnic residential segregation (see, for example, Gardner).

A cross-section encompassing the entire population of the North Atlantic world in the late nineteenth century will open up vast new terrain in the fields of history, economics, demography, and sociology.

The censuses include a great deal of information on demography and social structure that can only be taken advantage of through the creation of a new microdata set. The late nineteenth century is a critical period in the study of fertility decline, urbanization, international migration, household composition and occupational structure.

The database will allow the construction of cross-tabulations on a wide range of topics that were not covered by census publications or were incompletely tabulated. Perhaps even more important is the potential for longitudinal and multilevel multivariate analyses opened by the availability of the database.

The North Atlantic census database will not only constitute an invaluable resource in its own right, but will also enhance the value of the previously created historical microdata samples. Used in combination, these microdata will constitute our most important resource for the study of nineteenth-century social structure. A full discussion of the specific topics that could be addressed with a complete machine-readable database of the nineteenth-century censuses of five countries would require many pages.

The paragraphs that follow sketch only a few of the most obvious research applications of the new database. The first Industrial Revolution may have begun in Lancashire, but by the late nineteenth century, the entire North Atlantic world was involved in manufacturing, the production of raw materials, or both. The North Atlantic database will offer unprecedented opportunities to explore economic structures within and between the nations during this critical transitional period.

For the first time, we will have consistently coded occupational data for multiple countries in the nineteenth century, and they will be available at the individual level for the entire population. This will allow comparative analysis at the level of persons, families, communities or regions, and investigation of the geographic organization of economic activity.

In four of the five countries, for example, mechanized textile manufacturing existed, and the census provides sufficient occupational detail to analyze the organization of the industry in each locality. All five nations were deeply involved in and interconnected by maritime industries. They competed in the rich North Atlantic fishery and in the transatlantic shipping trade. The North Atlantic database will not only reveal the structure of maritime industries, but will also allow the comparative investigation of maritime communities.

Fertility transition. At the time these censuses were taken, each of the North Atlantic countries was just beginning deliberate fertility limitation. The North Atlantic database will allow study of differential fertility patterns in this critical period of demographic transition, making it possible to assess the importance of such factors as occupational class, ethnicity, region, literacy, local economy, size of locality and family structures.

Study of this elemental shift in population structure has the potential to enhance our understanding of ongoing demographic change in the contemporary developing world. Past comparative analyses of the European fertility transition have relied on aggregate vital statistics (Coale and Watkins). This approach has two major disadvantages. First, aggregate vital statistics do not allow direct measures of child spacing or stopping behavior; only the level of fertility can be considered. Second, the aggregate approach does not allow control of individual-level socioeconomic characteristics.

The new database will allow analysis of fertility differentials through own-child methods (Cho, Retherford and Choe). Own-child methods of fertility analysis require very large datasets, and are therefore especially well suited to complete population databases. Thus, the database will allow a new and more subtle generation of comparative studies of the first demographic transition.
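To give a sense of why own-child analysis depends on household-level microdata, the following Python sketch computes a crude ratio of own children under five to women aged 15 to 49, broken down by a grouping variable. It is a much-simplified stand-in: own-child methods proper also reverse-survive children and women with life tables and adjust for children who cannot be matched to a mother, none of which is attempted here. The record keys (`household`, `person`, `mother`, `group`) are hypothetical names chosen for the example.

```python
from collections import defaultdict


def child_woman_ratios(records, max_child_age=4, min_age=15, max_age=49):
    """Own children aged 0..max_child_age per woman aged min_age..max_age, by group.

    Each record is a dict with hypothetical keys: household, person, sex, age,
    group, and (for children) mother, the person number of the mother within
    the same household.
    """
    women = defaultdict(int)      # group -> number of women of childbearing age
    children = defaultdict(int)   # group -> number of matched young children
    group_of = {}                 # (household, person) -> mother's group

    for r in records:
        if r["sex"] == "F" and min_age <= r["age"] <= max_age:
            women[r["group"]] += 1
            group_of[(r["household"], r["person"])] = r["group"]

    for r in records:
        mother = (r["household"], r.get("mother"))
        if r["age"] <= max_child_age and mother in group_of:
            # The child is attributed to the mother's group, not its own.
            children[group_of[mother]] += 1

    return {g: children[g] / n for g, n in women.items()}
```

Because each child is attributed to the characteristics of its own mother, the same tabulation can be repeated for any individual-level variable, which is what aggregate vital statistics cannot offer.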

Household and family composition. For more than a century, political theorists, sociologists and historians have been debating the relationship between industrialization and the family. In the s, a series of British, Canadian and American studies argued that the harsh economic conditions of early industrial capitalism strengthened the interdependence of family members and led to a high frequency of complex households (Anderson; Hareven; Katz; Foster; Modell). Each of these analyses focused on a single industrializing community, and so was unable to test the proposed association between industrial development and family or household composition.

Comparisons across national boundaries have also been inhibited by inconsistencies in the construction of measures of household composition. Thus, there is presently little agreement about national similarities and differences in family and household composition in the late nineteenth century. Some of the most promising recent work has focused on relatively small population subgroups, such as the living arrangements of the aged or of unmarried mothers of young children, but only the largest samples are capable of supporting such investigations. The North Atlantic database will include a common set of constructed variables to aid in the analysis of family and household composition and will thus allow consistent comparisons across all five countries.

It will allow investigators to assess the impact of local context on family systems through multilevel analysis, and thus for the first time permit analysis of the effects of individual-level factors, local economic conditions, regional inheritance systems, and national characteristics on the nineteenth-century family.

International migration. The late nineteenth century saw international population movements on an unprecedented scale. The massive North Atlantic migration profoundly shaped both the receiving and contributing countries. The great majority of emigrants from Norway, Iceland and Britain went to Canada and the United States, and the influx transformed North American society.

Many of these newcomers remained only a few years before returning to their homelands, often bringing home money and always bringing new ideas and experiences (Runblom and Norman; Nugent; Gjerde; Thorvaldsen). The North Atlantic database will be a rich resource for the study of migration history. It will allow close and consistent comparisons of occupational structure, marriage patterns, fertility and family composition. Researchers will be able to identify and compare specific sending and receiving communities. In some instances, it will even be possible to follow individual migrants across the Atlantic and back again.

In combination with new machine-readable ship lists and emigration registers, the database will open a new window on the implications of international population flows.

In addition to scholarly research, we anticipate that the new database will make important contributions to teaching in the social sciences, helping to bring the excitement of discovery into the classroom. The detailed geographic analysis made possible by the new database makes it a suitable vehicle for introducing a quantitative dimension into secondary, undergraduate and graduate courses focusing on local history. Once the North Atlantic database is created, we plan to collaborate in the development of web-based instructional materials that capitalize on the fine detail available for local areas and small population subgroups.

As part of the project, we will develop a comparative analysis of enumeration procedures and systematically gather evidence on underenumeration from all five countries. The data for each country are presently in the form of alphabetic character strings that represent a transcription of the information collected from each individual in the late nineteenth century.

The individual records are grouped into residential units corresponding to the modern census concepts of household and group quarters. There are approximately 90 million of these records, and the information is recorded in English, French, Icelandic or Norwegian. In their present form, the data have little social science application, because the number of variations of each variable is too great for researchers to digest. For example, we estimate that the data include approximately four million different occupational strings, one million birthplaces and 50, family relationships.

Each country has raised funds to classify these alphabetic strings into numerically coded categories. This work is already underway in Britain, and is scheduled to begin during the coming year in Canada, Norway and the United States. Were it not for the proposed collaboration, each country would code variables strictly according to its own conventions, and the result would be five separate and fundamentally incompatible datasets.

Some variables—age, sex and marital status—can be made comparable with little effort, but the complex variables will require close collaboration to develop common coding standards. Occupational coding is the most challenging component of the project. The fine detail available in the occupational field is one of the reasons why the North Atlantic database has the potential to transform our understanding of historical social structure. At the same time, however, the complexity of occupational structure will demand meticulous care to ensure consistency.

The HISCO system is a modification of the United Nations occupational classification system with extensions to accommodate historical occupations. An international committee with representatives from Belgium, Canada, England, France, Germany, the Netherlands, Norway, Sweden and the United States is nearing completion of the final description of the system. We will further modify and extend the system to accommodate the additional detail available in the North Atlantic database.

Similarly, we intend to adapt United Nations classification systems as a framework for the other principal complex variables, such as birthplaces, family relationships, group quarters, and ethnicity. To translate from character strings into numeric codes, we must construct a data dictionary that assigns a numeric code to each alphabetic variation that occurs in the data. This work is difficult enough in the context of a single country; for a project of this scale, it requires a team of expert coders who work in close cooperation, sharing coding decisions continuously.
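The data dictionary described above can be pictured as a lookup table from transcribed strings to numeric codes. The Python sketch below is a minimal illustration under stated assumptions: the entries and code numbers are placeholders invented for the example (they are not HISCO or United Nations codes), and the normalization step (lower-casing and collapsing whitespace) is only one plausible way of reducing the number of distinct variations.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical dictionary entries; the numeric codes are placeholders,
# not actual HISCO or United Nations codes.
OCCUPATION_CODES: Dict[str, int] = {
    "farmer": 1,
    "farm labourer": 2,
    "farm laborer": 2,   # spelling variant mapped to the same code
    "fisherman": 3,
    "fisker": 3,         # Norwegian-language string mapped to the same code
}


def normalize(raw: str) -> str:
    """Lower-case the string and collapse runs of whitespace."""
    return " ".join(raw.lower().split())


def code_strings(
    strings: Iterable[str], dictionary: Dict[str, int]
) -> Tuple[List[Tuple[str, Optional[int]]], List[str]]:
    """Return (raw string, code) pairs plus the sorted normalized strings
    that have no dictionary entry yet."""
    table = {normalize(k): v for k, v in dictionary.items()}
    coded: List[Tuple[str, Optional[int]]] = []
    uncoded = set()
    for raw in strings:
        key = normalize(raw)
        value = table.get(key)
        coded.append((raw, value))
        if value is None:
            uncoded.add(key)
    return coded, sorted(uncoded)
```

Returning the uncoded strings as a separate list reflects the workflow the text implies: coders review the unmatched variations, agree on a code, and add the entry to the shared dictionary so that every country's data are coded the same way.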