ProQuest


CSA Illustrata: Natural Sciences FAQs

 
 
Indexing and Record Format

    Q: What do we mean by "deep indexing"?

    A: Deep indexing is the process of extracting and interpreting data about tables, figures, or other media objects from the full text of an electronic document, such as an article from a scholarly journal. During the indexing process, a record describing each object is created and associated with the abstract record of the original article.

    Q: How are CSA Illustrata index terms created?

    A: Object descriptor terms and/or phrases are added to object records to enhance retrievability. These descriptors are divided into four groups:

    1. Subject descriptors – natural language terms describing an object, e.g., "oxygen consumption," "growth rate," "water temperature," or "mercury concentration."
    2. Geographic descriptors - controlled geographic terms describing an object. These terms may not be hierarchical. For example, if a rule exists, we automatically assign the index terms "Canada, Ontario, Toronto, Don R." when "Don River" appears in the caption. However, if no rule exists, only "Don R." will be assigned.
    3. Taxonomic descriptors – Latin taxonomic terms describing an object, often accompanied by a common name. The terms are controlled, but may not be hierarchical at this time. An example of a taxonomic term is "Esox lucius."
    4. Statistical descriptors - controlled statistical analysis terms describing an object, e.g., "ANOVA - Analysis of Variance." View the full list of object statistical terms.

    An additional indexing component is the assignment of Classifications (or Categories) to object records, defining each object in terms of format. "Table" is self-evident, but "Figure" can be split into dozens of subcategories, such as "Line Graph", "Pie Chart", etc. View the full list of object categories.

    The Indexing Process

    Indexers identify the key variables that best describe the data illustrated in each image. For example, the terms along the axes of a graph and the column and row headers of a table are important. If the caption contains important terms, indexers capture those as well (e.g., names of organisms, geographic terms, or other subject terms that may not be displayed in the figure or table itself).

    To assist the indexers, the entire caption, table, and other relevant text are sent through automated indexing routines, which match terms in that text (in the caption, for example) with terms in our controlled vocabularies. Any match may be useful for the index, but indexers may remove suggested terms at their discretion. The result is a natural language index to the images.
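
    The matching step described above can be pictured with a small Python sketch. It is only an illustration, not the routine CSA actually runs: the two vocabularies are hypothetical fragments built from the examples quoted in this FAQ, the function names are invented, and treating the "Don River" rule as four separate terms is an assumption.

```python
import re

# Hypothetical vocabulary fragments; the real controlled lists are not reproduced here.
GEOGRAPHIC_RULES = {
    "don river": ["Canada", "Ontario", "Toronto", "Don R."],
}
TAXONOMIC_TERMS = {
    "esox lucius": "Esox lucius",
}


def suggest_terms(caption):
    """Return candidate index terms found in a caption, grouped by descriptor type."""
    text = caption.lower()
    suggestions = {"geographic": [], "taxonomic": []}
    for phrase, terms in GEOGRAPHIC_RULES.items():
        # Whole-phrase match so that "don river" does not fire inside longer words.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            suggestions["geographic"].extend(terms)
    for phrase, term in TAXONOMIC_TERMS.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            suggestions["taxonomic"].append(term)
    return suggestions


def review(suggestions, rejected):
    """Mimic the indexer's discretion: drop any suggested term the indexer rejects."""
    return {
        group: [term for term in terms if term not in rejected]
        for group, terms in suggestions.items()
    }


suggested = suggest_terms("Growth rate of Esox lucius in the Don River")
final = review(suggested, rejected={"Toronto"})
```

    Keeping the suggestion step separate from the review step mirrors the division of labor in the text: the routines only propose terms, and the indexer decides which ones stay.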

    Q: Is CSA using a controlled thesaurus to index CSA Illustrata: Natural Sciences?

    A: For the most part, CSA Illustrata uses Natural Language to index the tables and figures. This allows a researcher better recall when using vocabulary specific to his or her field. However, in specific instances such as taxonomic and geographic indexing, a controlled vocabulary is applied.

    Q: What "pick lists" does ProQuest CSA use for CSA Illustrata indexing?

    A: A "pick list" is a set of directories and subdirectories used to organize data. Pick lists have the advantage of pre-collecting a set of resources about a topic, and they are often a quick and reliable way to find a starting point for your research.

    The indexers of CSA Illustrata have two pick lists of terms to choose from, which they use to assign the Object Categories and Statistical Terms.

    Around 30% of objects in the database fall under the category of table. The remaining 70% fall under the broad heading of "figure". The 5 main Object Categories for figures are: Graph, Illustration, Map, Photograph and Transmission/Emission Image. Each of these categories (except Transmission/Emission Image) is then subdivided further, allowing you to be very specific or quite broad in the type of figure you search for. View the full list of object categories. Each level of the hierarchy is indexed, so an individual record could have all three levels represented in the category field (M1).
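
    To see why one record can carry three values in the category field, consider the following Python sketch. The child-to-parent mapping is a hypothetical fragment (placing "Line Graph" and "Pie Chart" under "Graph" is an assumption; the full list of object categories is the authority), and the function name is invented for illustration.

```python
# Hypothetical fragment of the category hierarchy, stored as child -> parent.
PARENT = {
    "Line Graph": "Graph",
    "Pie Chart": "Graph",
    "Graph": "Figure",
    "Map": "Figure",
}


def m1_values(category):
    """Return every hierarchy level for a category, broadest first."""
    levels = [category]
    # Walk up the hierarchy until a category with no parent is reached.
    while levels[-1] in PARENT:
        levels.append(PARENT[levels[-1]])
    return list(reversed(levels))


line_graph_levels = m1_values("Line Graph")
map_levels = m1_values("Map")
```

    Because every level is stored, a broad search on the category "Graph" and a narrow search on "Line Graph" would both retrieve the same object.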

    There are over 140 different Statistical Terms in use within CSA Illustrata. View the full list of object statistical terms. You can search for a specific technique by using Q8, the field code for Object Statistical Terms. This is most easily done in the Advanced Search, under the Tables & Figures tab. Enter a term exactly as it appears on the list, or enter a unique word from a term.
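
    The two ways of entering a statistical term (the exact term, or a single word from it) can be sketched in Python as below. This is an illustration only: the term list holds just the two terms quoted in this FAQ, the function name is invented, and the matching logic is an assumption about how such a lookup could behave rather than a description of the search engine.

```python
import re

# Two terms quoted in this FAQ; the full list has over 140 entries.
STATISTICAL_TERMS = ["ANOVA - Analysis of Variance", "Standard Deviation"]


def q8_queries(user_input, terms=STATISTICAL_TERMS):
    """Return Q8= expressions for terms matching the input exactly or by one whole word."""
    wanted = user_input.strip().casefold()
    queries = []
    for term in terms:
        words = re.findall(r"\w+", term.casefold())
        if wanted == term.casefold() or wanted in words:
            queries.append("Q8=" + term)
    return queries


by_word = q8_queries("variance")
by_exact_term = q8_queries("standard deviation")
```

    If a word returns more than one expression, it is not unique to a single term, and the full term should be entered instead.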

    Q: What is the benefit of Natural Language Indexing?

    A: Natural Language indexing can assign an unlimited number of free text terms to a given table or figure. This allows researchers to locate objects using terms they use in day-to-day research.

    Q: What kind of metadata are we attaching to the object?

    A: Attached to each object will be a number of different sets of metadata. See a list of all possible fields and explanations in the table below:

    Field name                  Label   Examples*
    Accession Number            AN=     AN=301-0001107432
    Affiliation                 AF=     AF=Roseland Observatory
    Author                      AU=     AU=Thelen Giles
    Caption                     C1=     C1=(Salmon River Basin) and (water temperature)
    Category                    M1=     M1=line graph
    DOI                         DO=     DO=10.1605/01.301-0001070647.2006
    ISSN                        IS=     IS=0036-8075
    Object DOI                  OI=     OI=10.1605/01.301-0000094637.2005
    Object Descriptors          OD=     OD=Absorption units
    Object Geographic Terms     Q7=     Q7=USA, Maryland
    Object Statistical Terms    Q8=     Q8=Standard Deviation
    Object Subject Terms        Q5=     Q5=climate sensitivity
    Object Taxonomic Terms      Q6=     Q6=Algae
    Publication Year            PY=     PY=2007
    Publisher                   PB=     PB=Blackwell Publishing
    Taxonomic Terms             TX=     TX=
    Title                       TI=     TI=Solar eclipse: Testing IR flux during solar eclipse

    *Examples are not all taken from the same record

    Q: What browsable indexes exist for this product?

    A: There are four browsable indexes for CSA Illustrata: Author, Journal Name, Category and Object Descriptors.

    The Author and Journal Name indexes can be used to identify authors or journals included in CSA Illustrata: Natural Sciences.

    The Category index highlights around 60 different options you can use to limit or specify the particular type of image you wish to locate.

    The Object Descriptors index allows you to search through the Natural Language index terms used against any or all of the objects contained within CSA Illustrata: Natural Sciences.

    This last index, the Object Descriptors index, may be particularly useful because it is an alphabetical list of all the terms from each of the more specific fields:

    Object Geographic Terms, Q7=
    Object Statistical Terms, Q8=
    Object Subject Terms, Q5=
    Object Taxonomic Terms, Q6=
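
    The relationship between the Object Descriptors index and the four specific fields can be illustrated with a short Python sketch. The record structure (a dictionary of field codes mapped to lists of terms) and the function name are assumptions made for the example; the sample terms are the ones shown in the field table above.

```python
SPECIFIC_FIELDS = ("Q5", "Q6", "Q7", "Q8")


def object_descriptor_index(records):
    """Merge the terms of the four specific fields into one alphabetical list."""
    terms = set()
    for record in records:
        for field in SPECIFIC_FIELDS:
            terms.update(record.get(field, []))
    # casefold() keeps the ordering alphabetical regardless of capitalization.
    return sorted(terms, key=str.casefold)


# Hypothetical records built from the example values in the field table.
records = [
    {"Q5": ["climate sensitivity"], "Q7": ["USA, Maryland"]},
    {"Q6": ["Algae"], "Q8": ["Standard Deviation"]},
]
index = object_descriptor_index(records)
```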

    Q: Do we include the specific page number of the article in which the Object may be found?

    A: Yes. The page number on which the object appears in the article is given in two places: within the caption attached to the object itself, and as a separate entry in the source field on the object record page.

    Because this information is stored in the caption within the object, you will always be able to locate the source of an object easily, even if you copy the object itself from CSA Illustrata.

Content

    Q: How many articles and objects are indexed in CSA Illustrata: Natural Sciences?

    A: When CSA Illustrata: Natural Sciences was launched in January 2007, it contained over 165,000 articles and nearly one million objects.

    Q: Which journal titles are included in CSA Illustrata?

    A: As of February 2007 there are over 1,100 journal titles included in CSA Illustrata: Natural Sciences. The journals are indexed from cover-to-cover, and the current and pending serials source lists will be linked from the product factsheet.

    Q: How significant or important are these titles?

    A: Much of the initial content came from our major development partner, Blackwell Publishers, a respected name in scholarly communication. Within the first few months of product launch, ProQuest CSA had additional agreements for the content of at least 30 additional publishers. We will be building the core content of CSA Illustrata: Natural Sciences over the next three years, and users can expect to see major publishers, important journal titles, and the quality content they have come to expect from ProQuest CSA.

    Q: What subject areas are covered?

    A: CSA Illustrata: Natural Sciences covers a wide variety of journals from subject areas such as:

    Agriculture
    Biology
    Conservation
    Earth Sciences
    Education
    Environmental Studies
    Fish and Fisheries
    Food and Food Industries
    Forests and Forestry
    Geography
    Medical Sciences
    Meteorology
    Pharmacy and Pharmacology
    Public Health and Safety
    Veterinary Science
    Water Resources

    Q: Is there a stated acquisition policy for the file?

    A: The acquisition policy for this very new product is to add numerous titles in broad subject areas in the natural sciences from multiple publishers. Blackwell Scientific was our development partner (contributing over 800 titles in all disciplines from their Synergy collection). The publishers with whom CSA had signed agreements at the time of launch were:

      Akadémiai Kiadó
      BioOne
      BioMedCentral
      Blackwell Publishing
      Geological Society of America
      IOS Press
      Oxford University Press
      National Research Council Canada
      PLoS
      Springer-Verlag
      Taylor & Francis
      Walter de Gruyter

    CSA has very recently signed an agreement with Elsevier to add a subset of their titles to this new product.

    Q: What plans does ProQuest CSA have for expanding coverage? How many journals do we plan to cover?

    A: We are planning to continue growing the number of objects within the database by approximately 150,000 objects per month.

    The majority of content is from 2000 onwards with approximately 9,000 objects from publications older than this. This number will continue to grow as older content is added at the same time as we continue to add newer content.

    We plan to double the current content to two million objects by the end of 2007, and to reach three million by the end of 2008.

    Q: Are we including tables and figures from only those publishers with which we have formal agreements?

    A: A formal agreement with a publisher means ProQuest CSA is allowed to display the table or figure in CSA Illustrata: Natural Sciences at a number of different levels:

    • Pinkynails on the search results screen
    • Thumbnail images in the abstract and on the tables and figures tab
    • Large size image on the object record

    If a formal agreement with a publisher is not in place, we are not able to show the larger versions of the object. However, our policy is to keep these instances to no more than 4% of all objects in CSA Illustrata: Natural Sciences.

Searching, Display, Linking & Rights

    Q: Does CSA Illustrata search the full text of the articles?

    A: CSA Illustrata does not search the full text of the articles. Instead, it enables precision searching by searching the text and data surrounding tables and figures. It is able to search:

    • The caption of the image
    • The image category (graph, satellite image, etc.)
    • Terms used in the deep indexing of the document: these include subject, taxonomic, geographic and statistical descriptor terms taken from the image caption, data variable labels and surrounding text
    • Units for subject variables.

    Often, databases that search the full text of an article are not able to search the tables and figures, because the text in tables and figures forms part of an image.

    Q: Can we search CSA Illustrata: Natural Sciences both by itself and combined with other content?

    A: Yes. Login links can be created to CSA Illustrata: Natural Sciences, or the database can be selected from the databases page. Additionally, it will be invoked for subscribers when any database from the Natural Sciences area is searched. Note that some databases span disciplines, so CSA Illustrata: Natural Sciences may be invoked when searching another subject area, e.g., Social Sciences, if the institution subscribes to one of these 'spanning' databases.

    Q: Can we search by coordinates of the X and Y axes for tables, charts and graphs?

    A: We do not extract coordinate data when indexing objects at this time. We do, however, extract and index the text describing the axes whenever possible.

    Q: Is retrieval different using Natural Language Index terms instead of controlled vocabulary terms?

    A: With Natural Language Index terms, the searcher does not need to be familiar with the controlled vocabulary of a particular database, and is therefore more likely to retrieve relevant results using familiar, everyday search terms. Natural Language Index terms (also known as free text terms) use the author’s own words as index terms, rather than assigning pre-determined or pre-existing controlled vocabulary that ultimately may not be as precise as the natural language terms. In CSA Illustrata: Natural Sciences, searchable Natural Language Index terms are of great benefit because the deep indexing process captures terms found in the titles of the X- and Y-axes and within the caption.

    Q: What are those colored borders around the pinky and thumbnail images?

    A: The colored borders in the result display indicate tables or figures that match the user's search terms. This alerts users to the fact that images are included in the record and allows them to quickly determine the relevance of a result to their search.

    Q: On the results page under the Tables & Figures tab, what is the significance of the additional tables and figures tab breakdown?

    A: Approximately 30% of the objects in the database are tables, while the remaining 70% fall under the broad heading of "figure". The Figure category is divided into Graph, Illustration, Map, Photograph, and Transmission/Emission Image. View the full list of object categories.

    Q: Can we link to the full text or OpenURL from CSA Illustrata: Natural Sciences?

    A: Yes. The CSA Illumina Administrative Module includes the "Resource Options" tab for CSA Illustrata subscribers. This tab allows the selection of more than 880 titles for linking. If the library subscribes to other full-text resources, the library may enable those resources in the “Full-Text” tab or the OpenURL options in the Administrative Module to provide linking to the resources. Existing full-text linking and OpenURL settings will apply to CSA Illustrata as they do to all other CSA databases the library subscribes to.

    Q: Are COS: Scholar Universe records linked to CSA Illustrata: Natural Sciences?

    A: While an author may have a Scholar Profile and records in the CSA Illustrata: Natural Sciences database, there is no icon displayed and no direct correlation between the two databases at this time. Also, CSA Illustrata: Natural Sciences records are not included in the Scholar Profile Selected Publications list.

    Q: Save/Print/Email options provide links to images from Object records. Are these links persistent?

    A: Yes, using any of the Save/Print/Email options from either the abstract record or the object record itself in CSA Illustrata: Natural Sciences will record a persistent link for you back to the object itself. Clicking on this link will take you straight to a web page where just the image itself is displayed for you. This link will work for anyone who has authentication rights to gain access to CSA Illustrata.

    Please note that it is important to always use the link from the Save/Print/Email option itself. If you click on this link and then copy and paste the URL from the web page that displays the image, the copied URL will contain a session ID (shown as sessid=xxx). That URL is not persistent: it will not work if you send it to someone else, because the session will have expired. Always use the URL in the original exported record.
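
    Before sharing a link, a quick check like the following Python sketch can tell the two kinds of URL apart. It assumes the session ID appears as a query parameter named sessid, as the sessid=xxx notation above suggests; the function name and the example.org URLs are placeholders, not real CSA Illumina addresses.

```python
from urllib.parse import parse_qs, urlsplit


def has_session_id(url):
    """Return True if the URL carries a sessid query parameter and so will expire."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return "sessid" in query


# Placeholder URLs for illustration only.
copied_from_browser = "https://example.org/image?id=123&sessid=abc"
from_exported_record = "https://example.org/image?id=123"

expires = has_session_id(copied_from_browser)
persistent = not has_session_id(from_exported_record)
```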

    Q: Can we save object image files from CSA Illustrata: Natural Sciences?

    A: Through the Save, Print, Email option on the CSA Illumina platform, records for individual objects (or references with multiple objects) can be saved, printed or emailed. Each record then provides a persistent link back to the image in CSA Illustrata: Natural Sciences.

    Q: Can we use the object image file even if we don't subscribe to the journal?

    A: Each object in CSA Illustrata: Natural Sciences belongs to the original document publisher, and is subject to that publisher's permissions. Publisher details are available in the Publisher field of the Object Record. Additionally, each object is marked with an attribution that includes the publisher name. In the future, we will provide a link from each Object Record back to a general rights page or, when available, to an individual publisher's rights page.

    Q: Can we export Tables and Figures to Excel or PowerPoint?

    A: Objects can be saved to your PC and imported into production software such as MS PowerPoint, MS Word, or MS Excel as image files. Some production software may also allow you to 'drag and drop' object image files from the Web page to the application.

    We are currently investigating additional functionality which would allow some table data to be exported to MS Excel.

    Note: In all cases, the user is bound by copyright law as it applies to the article from which the object was extracted.

    Q: Do we have to subscribe to the electronic journal to see the full text?

    A: Yes, this is generally the case. Through the CSA Illumina Administration Module, each institution subscribing to CSA Illustrata has various options when deciding how they will link to full text. If your library has 'turned on' any of these options, you may be able to link out to the full text of the document.

    Q: Will the object image file be exportable to RefWorks in the future?

    A: Using either the RefWorks button or the Save, Print, Email option on the CSA Illumina platform, records for individual objects (or references with multiple objects) can be easily exported to RefWorks. Each record provides a persistent link back to the image in CSA Illustrata. We have begun exploring the technical issues of exporting the object image file along with the citation, but no release date for that enhancement has been set.