Hansard French/English.

Format

Data file

Language

French
English

Published/Created

Philadelphia : Linguistic Data Consortium, 1995.

Description

1 online resource

Details

Subject(s)

French language—Canada—Databases
Linguistics—Databases

Issuing body

Linguistic Data Consortium

Library of Congress genre(s)

Databases

Restrictions note

Use of these data is restricted to Princeton University students, faculty, and staff for non-commercial statistical analysis and research purposes only.

Summary note

The Hansard Corpus consists of parallel texts in English and Canadian French, drawn from official records of the proceedings of the Canadian Parliament. While the content is therefore limited to legislative discourse, it spans a broad assortment of topics, and the stylistic range includes spontaneous discussion and written correspondence along with legislative propositions and prepared speeches.
The collection presented here has been assembled by the LDC from the archives of two distinct secondary sources. Material from one time period of parliamentary proceedings was acquired through the IBM T. J. Watson Research Center, while material from another period was acquired through Bell Communications Research Inc. (Bellcore). The combined collection covers a time span from the mid-1970s through 1988, with no apparent duplication between the two data sources.
Aside from covering different time periods, the two archives are organized differently and have undergone different amounts and kinds of processing in being prepared as a parallel language resource. In addition, the Bellcore set itself comprises two distinct types of data: one appears to be the main parliamentary proceedings (similar in nature to the IBM set), while the other consists of transcripts from committee hearings. The three sets have been kept distinct in this publication, and each is described in greater detail in separate documentation files on the CD-ROM.
The three sets have the following in common:
* They are rendered here using the 8-bit ISO-Latin1 character encoding standard.
* They use a minimal amount of SGML tagging to identify sentences or paragraphs.
* All sets are organized using a parallel file structure, in which the content of a given English text file is matched by the content of a corresponding French text file.
* The SGML text files for the IBM and the Bellcore committee-hearings data are published in compressed form, using the public-domain GNU-Zip utility (gzip). The Bellcore m
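
The shared format described in the summary note (ISO-Latin1 text, gzip compression, and an English file matched by a French file) could be read roughly as follows. This is a minimal sketch, not part of the LDC distribution: the function names are hypothetical, and the file paths passed in are assumptions, since the actual directory layout and file naming are given only in the documentation files on the CD-ROM.

```python
import gzip


def read_latin1_gz(path):
    """Return the lines of a gzip-compressed ISO-8859-1 (ISO-Latin1) text file."""
    # Text mode with an explicit encoding, so accented French characters
    # are decoded correctly instead of being read as raw bytes.
    with gzip.open(path, mode="rt", encoding="iso-8859-1") as handle:
        return [line.rstrip("\n") for line in handle]


def read_parallel_pair(english_path, french_path):
    """Read an English file and its French counterpart.

    Both paths are hypothetical placeholders; no alignment is attempted
    here, and the SGML tags are left in the returned lines untouched.
    """
    return read_latin1_gz(english_path), read_latin1_gz(french_path)
```

The two lists are returned separately because the record does not state whether corresponding files are aligned line by line; any pairing of sentences or paragraphs would depend on the SGML tagging described in the corpus documentation.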

Notes

Data accessible via the Data and Statistical Services (DSS) website.

Source of description

Title from Princeton University's Data and Statistical Services website (viewed on July 31, 2017).

Language note

French and English.

OCLC

896409163

Other standard number

LDC95T20
https://catalog.ldc.upenn.edu/LDC95T20
ISBN: 1-58563-048-9
ISLRN: 711-183-299-010-5

Statement on responsible collection description

Princeton University Library aims to describe library materials in a manner that is respectful to the individuals and communities who create, use, and are represented in the collections we manage.

Princeton University Library Catalog
