PURPOSE
: JT-Extract allows the extraction of
fields and keys from a tab delimited file.
WHY
: Why indeed! Suppose you start with
an invoice file in VFC format. First you convert
it with JT-Convert to a tab-delimited file and
expand the VFC codes. At this point the records
are too big to presort. Next you use JT-Extract
to extract the key (e.g. invoice #) and address
data. JT-Extract then produces both an address file
(Addrfile) and a data file (Datafile), with each
record prepended with the key field.
You can then presort the Addrfile. In Jet Letter
2000, you can use the key field to look up one
or more Datafile records (pages) for that
key and print the invoice pages.
Other applications might require printing the
records in a different order. For example, you
might extract the zip code field. JT-Extract would
then create a Keyfile containing fields for the
zip code and the record count.
You have the option of sorting the key file in
various ways (see the Sort parameter below).
If the Keyfile option is specified, a key file is
also generated with 2 fields. They are the unique
key and the number of records with that key.
If the Sort option is specified, the keys in the
Keyfile will be sorted up or down based on either
the key or the count.
This allows Jet Letter to use the Keyfile to
access the data in the address file or the data
file in the order of the key file.
By making the first key the record # and sorting
backward, Jet Letter could print the data file
backward. By making the first key the customer
name or id, you could cluster all records for
each customer.
By making the first key the # of pages per record,
you could cluster multiple-page groups (e.g. get
all the 2-page records together). By making the
zip code the first key, you could cluster all
the records into zip code or reverse zip code
order. The count for each unique key can be used
to produce a report of zip code counts or # of
multipage clusters, to look for duplicates, etc.
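The key counting described above can be sketched in a
few lines of Python. This is only an illustration of
the idea, not JT-Extract's own code: the function name
count_keys and the sample zip codes are hypothetical.

```python
def count_keys(keys):
    """Count records per key, keeping keys in order of first appearance."""
    counts = {}  # Python dicts keep insertion order (3.7+)
    for key in keys:
        counts[key] = counts.get(key, 0) + 1
    return list(counts.items())

# Hypothetical first-key values (zip codes), one per input record.
zips = ["90210", "10001", "90210", "60614", "10001", "10001"]

report = count_keys(zips)

# Keys seen more than once, e.g. to look for duplicates.
duplicates = [key for key, n in report if n > 1]

for key, n in report:
    print(f"{key}\t{n}")  # same two-field, tab-delimited shape as a Keyfile
```

Each output line holds a key and its record count, which
is enough to report zip code counts or flag duplicates.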
USAGE
: JT-Extract Infile Fieldfile Addrfile Datafile [Keyfile] [Sort]
Infile
- This is an input file, tab delimited,
containing source data.
Fieldfile
- This is an input file, containing field descriptors.
Field descriptors are either 3 numbers or a # symbol.
The three numbers may be separated by one or more
blanks, a tab, or a -.
The special field descriptor "#" captures
the sequence number of the record.
The three numbers are the field # in the input record,
the starting column #, and the length of the
data to extract.
Column numbering and field numbering start at
1. The length may be overlong without error
(e.g. 9999).
The first field described is prepended to the Addrfile
and the Datafile records. This
allows Jet Letter to access the Datafile records
using the first Addrfile field as a key. Successive
fields described are appended to the Addrfile
records. These will generally be the mailing
fields.
The result is an address file suitable for address
presorting.
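As a rough illustration of how a field descriptor
could be read and applied, here is a minimal Python
sketch. It is not JT-Extract's implementation. The
names parse_descriptor and extract and the sample
record are hypothetical, and the sketch assumes the
starting column is counted within the selected field.
What happens when the field # does not exist is not
stated here; this sketch simply raises an error.

```python
import re

def parse_descriptor(line):
    """Parse one Fieldfile line into "#" or a (field, start, length) tuple."""
    text = line.strip()
    if text == "#":
        return "#"
    # The three numbers may be separated by blanks, tabs, or "-".
    field, start, length = (int(n) for n in re.split(r"[ \t-]+", text))
    return field, start, length

def extract(record, seq, descriptor):
    """Apply one descriptor to a tab-delimited record.

    seq is the record's sequence number, used by the "#" descriptor.
    """
    if descriptor == "#":
        return str(seq)
    field, start, length = descriptor
    fields = record.rstrip("\r\n").split("\t")
    value = fields[field - 1]            # field numbering starts at 1
    begin = start - 1                    # column numbering starts at 1
    return value[begin:begin + length]   # an overlong length is just clipped

# Hypothetical input record: invoice #, name, street, zip code.
record = "10042\tJane Doe\t12 Main St\t90210-1234"

print(extract(record, 1, parse_descriptor("4 1 5")))     # first 5 columns of field 4
print(extract(record, 1, parse_descriptor("2-1-9999")))  # all of field 2
print(extract(record, 7, parse_descriptor("#")))         # the sequence number
```

Using a length such as 9999 is a convenient way to take
everything from the starting column to the end of the
field.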
Addrfile
- This is an output file, tab delimited, that
will contain the extracted fields
prepended with the key field.
The key may be used with Jet Letter's Lookup functions
to access data in the Datafile.
Datafile
- This is an output file, tab delimited, that
will contain the Infile fields
prepended with the key field.
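The relationship between the Addrfile and Datafile
records can be sketched as follows. This is a
hypothetical Python illustration, not the program's
actual code: split_record, the descriptors, and the
sample record are all made up for the example, and the
starting column is assumed to be counted within the
selected field.

```python
def split_record(record, seq, descriptors):
    """Return (addr_line, data_line) for one tab-delimited Infile record.

    descriptors is a list of "#" or (field, start, length) tuples;
    the first one defines the key.
    """
    fields = record.split("\t")

    def pick(desc):
        if desc == "#":
            return str(seq)                  # record sequence number
        field, start, length = desc          # both numbered from 1
        return fields[field - 1][start - 1:start - 1 + length]

    values = [pick(d) for d in descriptors]
    key = values[0]
    addr_line = "\t".join(values)    # key, then the other extracted fields
    data_line = key + "\t" + record  # key, then all of the Infile fields
    return addr_line, data_line

# Hypothetical descriptors: invoice # as the key, then two mailing fields.
descriptors = [(1, 1, 9999), (2, 1, 9999), (3, 1, 9999)]
record = "10042\tJane Doe\t12 Main St\t2 pages"

addr, data = split_record(record, 1, descriptors)
print(addr)
print(data)
```

Because both lines start with the same key, a lookup on
the first Addrfile field can find the matching Datafile
records.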
Keyfile
- This is an optional output file, tab delimited,
with two fields.
The first field is the key, the second field is
the count of records with that key. If
unsorted, this file is in the order of the first
appearance of the key.
Sort
- This is an optional parameter that allows
sorting the Keyfile. The options are:
[ORIGINALUP|ORIGINALDOWN|KEYUP|KEYDOWN|COUNTUP|COUNTDOWN].
If this parameter is not specified, the key file
will be in NATURAL order. The sort is stable,
i.e. for any given key, the order will be in
the original Infile order.
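One way to picture the six sort options is the Python
sketch below. It rests on several assumptions that are
not confirmed here: that ORIGINALUP matches the order
of first appearance and ORIGINALDOWN reverses it, that
keys compare as plain text, and that ties keep their
original order in both directions. The function name
sort_keyfile and the sample entries are hypothetical.

```python
# Column to sort on (0 = key, 1 = count) and whether to sort descending.
SORT_MODES = {
    "KEYUP": (0, False),
    "KEYDOWN": (0, True),
    "COUNTUP": (1, False),
    "COUNTDOWN": (1, True),
}

def sort_keyfile(entries, mode=None):
    """Order (key, count) pairs that arrive in order of first appearance."""
    if mode in (None, "ORIGINALUP"):
        return list(entries)            # assumed: unchanged order
    if mode == "ORIGINALDOWN":
        return list(reversed(entries))  # assumed: reverse of first appearance
    column, descending = SORT_MODES[mode]
    # sorted() is stable, and reverse=True still keeps tied entries
    # in their original relative order.
    return sorted(entries, key=lambda e: e[column], reverse=descending)

# Hypothetical Keyfile contents: (zip code, record count).
entries = [("90210", 2), ("10001", 3), ("60614", 2)]

for mode in ("KEYUP", "COUNTUP", "COUNTDOWN"):
    print(mode, sort_keyfile(entries, mode))
```

With COUNTUP, the two entries that share a count of 2
stay in their original relative order, which is what a
stable sort guarantees.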
CAUTION
: When generating the Keyfile:
Sorting by a "#" key or by a relatively small
number of non-unique text keys is very fast.
Sorting by a text key (not a # key), where there
are a large number of different keys, may require
a dramatically longer time to process. This is
caused by the need to insert, in order, unique
keys for each record.
A 100,000 record file may go from a few seconds
to a few minutes.