Indexing Offline

From GCD
Revision as of 23:58, 8 January 2011 by Handrews (talk | contribs) (Add instructions for format conversions.)

1.1: What is indexing Offline?

In the old days, before there was an Online Interface for indexing comics for the GCD, all indexing had to be done offline. The offline indexes were tab-delimited text files that were then compiled together and distributed. With file-submission indexing, indexers prepared index files offline using a spreadsheet or text editor and submitted them to the GCD as e-mail attachments.

Indexing Offline, or Flat-File Indexing as it was often called, is currently supported on a per-issue basis.

1.2: How do I use it? How do I reserve comics to index Offline?

Use the Online Interface at http://www.comics.org as usual. To index a comic, first search the database to find out whether it is available for indexing. In the Search bar at the top of any page, enter the name of the comic and search for “Series.” It is often best to enter only a short, simple part of the name, because the search matches the exact text you enter. Thus, a search on “The Muppet Show Comic Book” will NOT return a series named “Muppet Show: The Comic Book.” A better search is simply “Muppet Show”; then choose the appropriate series from the list returned. Note that the GCD uses the official name of a publication as printed in its indicia or publication data within the book, which may not be the same as the name on the cover.

If you find the Series you wanted, look in the "Index Status Grid" on the Series page for the issue you want. If the issue you want to index is shown in white in the grid, then it should be available for you to make a reservation.

If you do not find the comic you want, one of the following may be missing:

If you have navigated to the series you wanted, you may reserve the issues you need by clicking through to them on the grid and then clicking the "Edit" button in the top right corner of any Issue page. All aspects of that issue are then available for you to index.

After you click "Edit" for an issue, you can always import a file with additional stories. If the issue has no stories attached, you can also import a file with information for the whole issue.

1.3: What is the format for indexing Offline?

Note that while data is sometimes optional, the format for the data is not. If a field is left blank, you must still include the empty field (i.e., have consecutive tabs) so that the other fields line up. Some software drops empty fields when saving a text file, so check the result carefully.
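One way to catch a dropped empty field before submitting is to count the tab-separated fields on each line. The following is a minimal sketch, not a GCD tool: the function name `find_bad_lines` and the idea of passing in the expected field count are assumptions made for illustration.

```python
def find_bad_lines(text, expected_fields):
    """Return (line_number, field_count) for lines with the wrong number of fields."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Split on tabs only; consecutive tabs yield empty strings, which still count.
        count = len(line.split("\t"))
        if count != expected_fields:
            bad.append((number, count))
    return bad


# Three fields with the middle one blank: two consecutive tabs keep the columns aligned.
good = "a\t\tc"
# The blank field was dropped on save, so only two fields remain.
broken = "a\tc"
print(find_bad_lines(good + "\n" + broken, expected_fields=3))
```

Because issue lines and story lines have different field counts, a check like this would be run separately on the first line and on the remaining lines of a whole-issue file.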

Many find word processors such as Microsoft Word or OpenOffice Writer poorly suited to preparing the flat file because of how they handle tabs. Spreadsheets are better suited, e.g. Microsoft Excel or OpenOffice Calc (multi-platform) / NeoOffice (Mac OS X), with the spreadsheet exported as a tab-separated file. Alternatively, use a text editor that can explicitly show tabs while you edit a plain text file.

The following list shows the 12 fields for issue records, in order.

  1. number
  2. volume
  3. indicia publisher
  4. brand
  5. publication_date
  6. key_date
  7. indicia_frequency
  8. price
  9. page_count
  10. editing
  11. ISBN
  12. notes

So the line would look like this (here and in the following, ^T represents a tab):

number ^T volume ^T indicia publisher ^T brand ^T publication_date ^T key_date ^T indicia_frequency ^T price ^T page_count ^T editing ^T ISBN ^T notes
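For indexers who generate the issue line from a script, the sketch below joins the 12 issue fields in the order listed above and emits an empty string for anything left out, so the tab count stays constant. The helper name `issue_line` and the dictionary-based input are assumptions for illustration only; the sample values are taken from the example further down this page.

```python
# Field order as listed above for issue records.
ISSUE_FIELDS = [
    "number", "volume", "indicia publisher", "brand",
    "publication_date", "key_date", "indicia_frequency",
    "price", "page_count", "editing", "ISBN", "notes",
]


def issue_line(values):
    """Join issue values with tabs, using an empty field for anything missing."""
    unknown = set(values) - set(ISSUE_FIELDS)
    if unknown:
        raise KeyError(f"unknown issue fields: {sorted(unknown)}")
    return "\t".join(values.get(name, "") for name in ISSUE_FIELDS)


line = issue_line({"number": "1", "price": "0.60 DEM", "page_count": "36"})
print(line.count("\t"))  # 11 tabs separate the 12 fields
```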

The following list shows the 16 fields for story records, in order.

  1. Story Title
  2. Type
  3. Feature
  4. Page Count
  5. Script
  6. Pencils
  7. Inks
  8. Colorist
  9. Letterer
  10. Editor
  11. Genre
  12. Character Appearances
  13. Job Number
  14. Reprint info
  15. Short Synopsis of Story
  16. Notes

Story Title ^T Type ^T Feature ^T Page Count ^T Script ^T Pencils ^T Inks ^T Colorist ^T Letterer ^T Editor ^T Genre ^T Character Appearances ^T Job Number ^T Reprint info ^T Short Synopsis of Story ^T Notes
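Going the other way, a story line can be split back into named fields to check that each value sits in the intended column. This is a rough sketch under the assumption that the 16 field names above are used as dictionary keys; `parse_story_line` is a hypothetical helper, and the sample values come from the example further down this page.

```python
# Field order as listed above for story records.
STORY_FIELDS = [
    "Story Title", "Type", "Feature", "Page Count", "Script", "Pencils",
    "Inks", "Colorist", "Letterer", "Editor", "Genre",
    "Character Appearances", "Job Number", "Reprint info",
    "Short Synopsis of Story", "Notes",
]


def parse_story_line(line):
    """Split one story line on tabs and pair each value with its field name."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(STORY_FIELDS):
        raise ValueError(f"expected {len(STORY_FIELDS)} fields, got {len(values)}")
    return dict(zip(STORY_FIELDS, values))


sample = "\t".join([
    "Das verlorene Tal", "story", "Turok", "?", "?", "Alberto Giolitti",
    "Alberto Giolitti; Giovanni Ticci (Assistent)", "none", "gesetzt", "none",
    "", "", "", "from Turok, Son Of Stone (Gold Key, 1962 series) #30", "", "",
])
story = parse_story_line(sample)
print(story["Pencils"], "|", story["Reprint info"])
```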

If you import a full issue, the first line is the issue record, followed by one line for each story sequence.

An example of a whole-issue import:

1 ^T ^T Bildschriftenverlag ^T ^T ^T ^T ^T 0.60 DEM ^T 36 ^T ? ^T ^T Informationen vom Bildschriftenarchiv.
^T cover ^T ^T 1 ^T none ^T ? ^T ? ^T ? ^T none ^T none ^T ^T ^T ^T from Turok, Son Of Stone (Gold Key, 1962 series) #30 ^T ^T
Das verlorene Tal ^T story ^T Turok ^T ? ^T ? ^T Alberto Giolitti ^T Alberto Giolitti; Giovanni Ticci (Assistent) ^T none ^T gesetzt ^T none ^T ^T ^T ^T from Turok, Son Of Stone (Gold Key, 1962 series) #30 ^T ^T
Ein Tag im Leben eines Dinosauriers ^T story ^T Junge Erde ^T ? ^T ? ^T Rex Maxon ^T Rex Maxon ^T none ^T gesetzt ^T none ^T ^T ^T ^T from Turok, Son Of Stone (Gold Key, 1962 series) #30 ^T ^T
Beute der Fleischfresser ^T story ^T Turok ^T ? ^T ? ^T Alberto Giolitti ^T Alberto Giolitti; Giovanni Ticci (Assistent) ^T none ^T gesetzt ^T none ^T ^T ^T ^T from Turok, Son Of Stone (Gold Key, 1962 series) #30 ^T ^T
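To make the layout of the example concrete, the sketch below converts the ^T display notation used on this page into real tabs and then separates the issue record from the story records. Both helper names are hypothetical, and only the first two lines of the example are reproduced to keep the block short.

```python
def from_caret_notation(line):
    """Turn the '^T' display notation into a real tab-separated line."""
    return "\t".join(part.strip() for part in line.split("^T"))


def split_issue_file(text):
    """Split a whole-issue import into (issue_fields, list_of_story_fields)."""
    lines = [line for line in text.splitlines() if line.strip()]
    records = [line.split("\t") for line in lines]
    # The first line is the issue record; every following line is a story sequence.
    return records[0], records[1:]


shown = [
    "1 ^T ^T Bildschriftenverlag ^T ^T ^T ^T ^T 0.60 DEM ^T 36 ^T ? ^T ^T Informationen vom Bildschriftenarchiv.",
    "^T cover ^T ^T 1 ^T none ^T ? ^T ? ^T ? ^T none ^T none ^T ^T ^T ^T from Turok, Son Of Stone (Gold Key, 1962 series) #30 ^T ^T",
]
text = "\n".join(from_caret_notation(line) for line in shown)
issue, stories = split_issue_file(text)
print(len(issue), [len(story) for story in stories])  # prints: 12 [16]
```

The counts confirm that the issue line carries 12 fields and the cover line 16, with the blank fields preserved as empty strings.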

If you are only adding sequences to an existing issue, omit the issue line; the story lines are otherwise the same.

Several additional things to consider:

  • on the issue line or on a story line
    • If the Page_Count is uncertain, please add a question mark '?' after it.
  • on the issue line
  • on the story line
    • If a credits field is not applicable for the story (e.g. a black and white story has no colorist), please enter 'none' for that field.
    • If the story title is made up, please put it in square brackets [].
    • Type must be one of the values in our list of Types.
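The conventions above can be expressed as small formatting helpers. This is only a sketch: the function names are hypothetical, writing the question mark directly after the number with no space is an assumption, and `EXAMPLE_TYPES` holds just the two values that appear in the example on this page, standing in for the real list of Types.

```python
# Placeholder only: the two Type values seen in the example on this page.
EXAMPLE_TYPES = {"cover", "story"}


def format_page_count(count, uncertain=False):
    """Append '?' to an uncertain page count, e.g. 36 -> '36?'."""
    return f"{count}?" if uncertain else str(count)


def format_credit(name):
    """Use 'none' when a credit does not apply (None means not applicable)."""
    return "none" if name is None else name


def format_title(title, made_up=False):
    """Wrap a made-up title in square brackets."""
    return f"[{title}]" if made_up else title


def check_type(value, known_types=EXAMPLE_TYPES):
    """Raise ValueError if the Type is not in the supplied list."""
    if value not in known_types:
        raise ValueError(f"unknown Type: {value!r}")
    return value


print(format_page_count(36, uncertain=True), format_credit(None), format_title("Untitled", made_up=True))
```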

1.4: How do I save in the right format from...

1.4.1 Microsoft Excel 2010 (Windows)

  • Click on the File tab at the upper left of the window
  • Choose "Save As" (near the top left)
  • Select "Text (Tab delimited)" for the "Save as type" and click "Save"
    • You may want to first change the file name to end in ".tsv", but this is not required and has no effect on the contents
  • If Excel complains that the selected file type does not support multiple sheets, then choose "OK" to save only the active sheet
  • If Excel complains that the sheet may contain features that are not compatible with Text (Tab delimited), choose "Yes" to keep the format anyway

Note: Quotes may or may not be handled correctly in this export format. If you have used it and can confirm that this works, please edit this page or let us know.

1.4.2 OpenOffice

OpenOffice is a free, multi-platform office software suite, available for Linux and other UNIX-compatible systems among others.

If you have instructions for OpenOffice, please edit this page or let us know! We suspect that the instructions for NeoOffice are similar, so in the meantime please try those.

1.4.3 NeoOffice (Mac OS X)

NeoOffice is a free version of OpenOffice developed for the Mac. These instructions were written for version 3.1.1 patch 0.

  • From the File menu, choose "Save As"
  • Select "Text CSV (.csv)" from the list of formats
    • You might want to change the file name so that it ends in ".tsv" since we will actually be saving as tab-separated, but this is not required and has no effect on the contents.
  • If you are warned that you will lose formatting or other aspects if you save as .csv, select "Keep Current Format".
  • Next, you will be asked to set export options. Choose the following:
    • Character set: Unicode (UTF-8)
    • Field delimiter: {Tab}
    • Text delimiter: [leave this blank; you will need to highlight the contents of the field with your mouse and press delete to clear it manually, as there is no drop-down option for a blank text delimiter]
    • Save cell content as shown: leave checked
    • Fixed column width: leave unchecked

The resulting file (whether it ends in .csv, .tsv, or anything else) should be in the correct format.
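For comparison, the same export settings (tab as the field delimiter, no text delimiter) can be approximated with Python's standard `csv` module. This is a sketch under assumptions, not a tested GCD workflow: `write_tsv` is a hypothetical helper, and rejecting values that contain tabs or line breaks is a design choice made here because nothing can protect them once quoting is switched off.

```python
import csv
import io


def write_tsv(rows, stream):
    """Write rows as tab-separated text with no text delimiter (no quoting)."""
    writer = csv.writer(
        stream,
        delimiter="\t",          # Field delimiter: {Tab}
        quoting=csv.QUOTE_NONE,  # Text delimiter: blank
        quotechar=None,          # quote marks in values pass through unchanged
        lineterminator="\n",
    )
    for row in rows:
        for value in row:
            # Without quoting, a tab or line break inside a value would corrupt the columns.
            if "\t" in value or "\n" in value or "\r" in value:
                raise ValueError(f"field contains a tab or line break: {value!r}")
        writer.writerow(row)


# For a real file, open it with open(path, "w", encoding="utf-8", newline="").
buffer = io.StringIO()
write_tsv([["1", "", "Bildschriftenverlag"], ['a "quoted" word', "story", ""]], buffer)
print(repr(buffer.getvalue()))
```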

1.4.4 Google Docs

These instructions are correct as of 8 January 2011.

  • Go to the File menu and hover over "Download as", which opens a sub-menu
  • From the sub-menu, choose "Text (current sheet)"
  • The resulting file will be named with a ".tsv" extension and will be in the correct format