Antelope Release 5.5 Mac OS X 10.8.5 2015-04-21

 

NAME

db2segy - conversion from css3.0 traces to segy disk image

SYNOPSIS

db2segy db outfile [ [-SU|-V SU|-V 0|-V 1] -ss subset -pf pffile -v -d desc]

SUPPORT


Contributed code: NO BRTT support.
THIS PIECE OF SOFTWARE WAS CONTRIBUTED BY THE ANTELOPE USER COMMUNITY. BRTT DISCLAIMS ALL OWNERSHIP, LIABILITY, AND SUPPORT FOR THIS PIECE OF SOFTWARE.

FOR HELP WITH THIS PIECE OF SOFTWARE, PLEASE CONTACT THE CONTRIBUTING AUTHOR.

DESCRIPTION

db2segy is a fairly flexible program for producing a segy tape image file from data stored in a css3.0 database. The output is known to be readable by ProMAX distributed through release 98 of Landmark Graphics, but it should work correctly with any processing system with a SEGY reader that accepts IEEE float data (not part of the original SEGY standard). This program was built with a strong prejudice that the only rational use of this converter would be to collect a group of traces from a database and output a suite of traces that look like shot gathers. The start times of each of the pseudoevents assembled this way are passed to the program through standard input and a set of fixed length traces are produced with a constant number of channels per gather as required by the segy standard.

An exception to this is that one can alternatively map all data to look like a stacked cdp section. In this case, the source and receiver coordinates are set equal and individual events end up being set to look like single-fold, zero offset reflection data.

This program has several bells and whistles controlled by the input parameter file. The optional parts of this are described below, but one key design feature needs to be recognized here. The parameter file is used to specify a list of station/channel pairs that define the output channel order of the segy tape image. Because multichannel shot data always have fixed numbers of channels per shot file, this is forced into the output by this list. That is, each station/channel listed is written in the position defined by the parameter file (see below), with the first entry in the list being channel 1 and the last entry in the list defining the number of channels per gather. When the program actually runs, station/channel pairs not found in the specified time period are silently flagged as "dead" in the SEGY headers for that channel number and the corresponding trace is filled with zeros. A special case produces different behaviour: some people have used this program to produce data effectively sorted into common receiver gathers by specifying that the array consists of one and only one station. This creates an anomaly in dealing with dead channels. In this case, data for time periods when the station had no data are simply skipped BUT the ffid field (ffid = event number) is still incremented. Because the ffid field in this program is derived from the input shot list by counting, one can then normally sort out the geometry correctly, since all systems I've ever used utilize ffid as a unique key for shot data. A corollary of this for ALL uses of this program is that it is wise to use the same shot list for all data from a single experiment, or you may have a hard time working out the geometry.
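The fixed channel layout can be pictured with a short Python sketch. This is only an illustration of the idea, not the code db2segy actually runs; the function name assemble_gather, the dictionary layout, and the input structure are all hypothetical.

```python
def assemble_gather(channel_order, traces, nsamp):
    """Lay traces out in the fixed order given by channel_order.

    channel_order: list of (sta, chan) pairs, as in the "channels" list.
    traces: dict mapping (sta, chan) -> list of samples found for this gather.
    nsamp: number of samples in every output trace.
    All names here are hypothetical; this only illustrates the layout rule.
    """
    gather = []
    for number, key in enumerate(channel_order, start=1):
        data = traces.get(key)
        dead = data is None
        # A missing station/channel pair becomes a zero-filled "dead" trace
        # so the number of channels per gather never changes.
        samples = [0.0] * nsamp if dead else list(data[:nsamp])
        gather.append({"channel": number, "sta": key[0], "chan": key[1],
                       "dead": dead, "samples": samples})
    return gather


order = [("100", "EHZ"), ("101", "EHZ"), ("102", "EHZ")]
found = {("100", "EHZ"): [1.0, 2.0, 3.0], ("102", "EHZ"): [4.0, 5.0, 6.0]}
for trace in assemble_gather(order, found, nsamp=3):
    print(trace["channel"], trace["sta"], trace["chan"], trace["dead"])
```

Station 101 is absent from the input in this example, so it still occupies channel 2 but is marked dead and filled with zeros.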

There are two ways to set source geometry information in the segy headers. One mechanism is to use the extension "shot" table described in more detail below. For most users a more convenient method is to specify these coordinates as part of the input stream. This is controlled by the input boolean input_source_coordinates in the parameter file. When set true, the program will assume that the input lines that drive the program are of the form:

  time  ffid x(lon)   y(lat)   elev
where the time field string must NOT be specified in a format with spaces (e.g. 2009001:01:22:22.405 is acceptable but "1/1/2009 01:22:22.405" will not work). The time string is passed directly to the str2epoch(3) procedure, so any string (without blanks) that can be cracked by str2epoch should work. The ffid is used to set the field file number (ffid = fldr in Seismic Unix) AND the energy point number (ep in Seismic Unix). Note that this supersedes the counting method described above, which is used to set ffid only when this input format is not used; i.e. the line count is ignored when running in this mode and this value is used to set ffid. In general this should make indexing easier for most likely uses of this program, as ffid is the lowest common denominator in reflection data.
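A shot list in this layout is easy to generate from a script. The following Python sketch builds one input line and rejects time strings containing blanks. The helper name format_shot_line and the number of decimal places are assumptions made for illustration; db2segy itself only requires free-format, blank-separated fields.

```python
def format_shot_line(time_str, ffid, x, y, elev):
    """Return one stdin line in the 'time ffid x y elev' layout.

    The helper name and the printed precision are hypothetical choices.
    """
    # The time field must be a single blank-free token.
    if any(ch.isspace() for ch in time_str):
        raise ValueError("time string must not contain blanks: %r" % time_str)
    return f"{time_str} {ffid:d} {x:.6f} {y:.6f} {elev:.1f}"


print(format_shot_line("2009001:01:22:22.405", 12, -110.5, 44.25, 2100.0))
```

Passing "1/1/2009 01:22:22.405" as the time string raises ValueError here, mirroring the restriction described above.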

This program handles source and receiver coordinates according to the SEGY standard, but with some limitations. The first is that you get units of meters, decimal degrees, or arcseconds depending on the settings of the parameter file booleans use_geo_coordinates and prefer_decimal_degrees and the -V command line option. When use_geo_coordinates is true you will get output coordinates in either arcseconds (SEG-Y Rev 0) or decimal degrees (SEG-Y Rev 1 only, and only if prefer_decimal_degrees is set). Otherwise they will be in meters or null.

Receiver coordinates are taken from the site table. If using geographic coordinates, the lat and lon fields are used. If you are using meters, the dnorth and deast fields of the site table are used. When using arcseconds, be warned that the standard Antelope schema definition for site coordinates will lead to a loss of accuracy for receiver coordinates, as the number of digits stored is only sufficient for approximately 100 m accuracy. If you need better resolution you will need to edit the header data yourself.

Source coordinates can optionally be passed to the program through the standard input stream. This is controlled by the boolean variable input_source_coordinates. Fields read as x=lon and y=lat are blindly converted depending upon the setting of use_geo_coordinates. When true, the input coordinates are treated as decimal degrees and converted to arcseconds if necessary; otherwise they are assumed to be in units of meters (e.g. UTM coordinates).

The parameter coordinate_scale_factor is treated as described in the standard with some restrictions. Specifically, it is required to be a multiplier, so a number smaller than one is not allowed. Internally the standard then requires this number to be converted to a negative value and stored at offset 71-72 in the segy header. Thus, all coordinates stored in the header are multiplied by this value. To restore them to meters, divide by this number. To restore decimal degree coordinates from arcsecond coordinates, divide header values by (coordinate_scale_factor * 3600).
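The arithmetic can be checked with a small Python sketch. The function names are hypothetical, and rounding to the nearest integer is an assumption; the actual conversion inside db2segy may truncate or round differently.

```python
def to_header_arcsec(degrees, scale):
    """Decimal degrees -> scaled integer arcseconds for the trace header.

    Rounding to nearest is an assumption, not confirmed program behaviour.
    """
    return int(round(degrees * 3600.0 * scale))


def header_scalar(scale):
    """Value stored at byte offset 71-72: the multiplier, made negative."""
    return -int(scale)


def from_header_arcsec(value, scale):
    """Recover decimal degrees from a scaled arcsecond header value."""
    return value / (scale * 3600.0)


scale = 100                      # coordinate_scale_factor
lon = -110.123456                # decimal degrees
stored = to_header_arcsec(lon, scale)
print(stored, header_scalar(scale))
print(from_header_arcsec(stored, scale))
```

With a scale factor of 100 the header value resolves 0.01 arcsecond, so the recovered longitude agrees with the input to within 1/(100 * 3600) degree.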

db2segy allows one to optionally rotate data to any specified orthonormal coordinate system. This makes sense, of course, only with three component data and rational channel codes (e.g. EHZ, EHN, and EHE) AND when the sitechan table is complete and correct. The program will almost certainly die with a diagnostic if you attempt to rotate data that does not include a complete set of three components. Because it uses rotate_to_standard in the trace library it will also have problems if the input database has multiple channel codes for that station. If you have multiple channel code data, you should subset the data first before running this program. The output channel order for rotated data is defined in the same way as described above from an output list of station/channel pairs in the parameter file. This allows an important flexibility as three component data can be output in station bundles of three adjacent traces or as three separate groupings in the output shot gather file (e.g. vertical first, followed by radial, followed by the transverse components). Some important details related to the rotation feature are described below that are controlled by the input parameter file specifications. Note, by the way, that segy was not written with three-component data in mind as no orientation information is stored in the trace headers. The orientation mechanism used by processing systems is usually implicit and requires traces to be in a particular order on input. Hopefully this mechanism will cover all the bases.

db2segy treats data gaps in a way that needs to be recognized. It makes an assumption that gaps should be treated as clipped data. This means data in intervals flagged by the trace library routine trload_css as a gap will take on one of two values: the upper or lower data limit defined for the original data type (see trgaps(3)). The "upper" value is set if the last sample before the gap was positive, and the "lower" value is used if the last sample was negative. This algorithm may give erroneous full scale transients for true data gaps not caused by clipping. An exception to this rule is a "gap" flagged at the front or end of a trace. Because this would commonly happen with variable start times on different traces, gaps at the front or end of a trace will be zeroed instead of set to full scale.
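The gap rule can be sketched in Python as follows. This is a hypothetical reimplementation for illustration only, not the trace library code. The function name, the boolean gap mask, and the treatment of an exactly zero sample before a gap (handled here like a negative one) are all assumptions.

```python
def fill_gaps(samples, is_gap, lower, upper):
    """Fill flagged gaps the way the text describes.

    samples: list of numbers; is_gap: list of bools of the same length.
    lower, upper: data limits for the original data type.
    Assumption: a zero sample before a gap selects the lower limit.
    """
    n = len(samples)
    valid = [i for i in range(n) if not is_gap[i]]
    if not valid:
        return [0.0] * n          # nothing but gap: all zeros
    first, last = valid[0], valid[-1]
    out = list(samples)
    fill = 0.0
    for i in range(n):
        if i < first or i > last:
            out[i] = 0.0          # leading or trailing gap is zeroed
        elif is_gap[i]:
            out[i] = fill         # interior gap is treated as clipping
        else:
            # Remember which limit a gap starting after this sample gets.
            fill = upper if samples[i] > 0 else lower
    return out


data = [0, 0, 5, 7, 0, 0, -3, 0, 2, 0]
gaps = [True, True, False, False, True, True, False, True, False, True]
print(fill_gaps(data, gaps, lower=-100, upper=100))
```

In this example the interior gap after the positive sample 7 is set to the upper limit, the gap after -3 is set to the lower limit, and the gaps at both ends of the trace are zeroed.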

OPTIONS

FILES

db2segy expects to see stdin in one of two forms. As noted above, when input_source_coordinates is set true, the input format is expected to be in the following order (free format ascii):

time lon(x) lat(y) depth
When false, only the first field, the start time, is required. If there is any other data on an input line it will be silently discarded in this mode. In both cases the time field string is passed directly to str2epoch(3), so anything str2epoch can crack should produce the desired results. If in doubt, run the program in verbose mode and the times will be echoed by strtime in its default time string format. How the coordinate fields are handled depends on the setting of use_geo_coordinates. When true, coordinates are assumed to be longitude and latitude in degrees. Otherwise they are tacitly assumed to be meters in some local coordinate system.

Standard output lists the output station/channel order and echoes trace channel and index number as conversion progresses.

The output file passed as argument two will silently overwrite an existing file if one by the same name already exists. This file is a segy tape image. The 3200 byte EBCDIC reel header is written as a block of pure nulls. The binary reel header is filled in and written immediately after the EBCDIC section as required by the standard. The trace data follow.

An optional extension to css3.0 called segy1.0 defines a table called shot that can be used to set the source coordinate fields in the segy header. The program will attempt to open this table ONLY when input_source_coordinates is false. In that mode (false) the program assumes all coordinates are in a local coordinate system with units of km set in the dnorth and deast fields of the shot and site tables. (These are implicitly assumed to be consistent.) If this table is not present when running in local coordinate mode, db2segy will silently leave the source coordinate fields in the segy header null.

PARAMETER FILE

The main controlling input for this program enters through a parameter file. It contains four types of parameters: (1) basic scalar parameters required by the program; (2) parameters related to three-component rotation; (3) output channel order definition; and (4) database table parameters. The following divides the parameters this way.

Basic Scalar Parameters

sample_rate defines the fixed sample rate in samples per second. All data must have the same sample rate (a SEGY limitation). Any traces that do not match the sample rate defined by this parameter will be skipped with an error message logged.

trace_length length of ALL output traces in seconds.

map_to_cdp Boolean variable. When true the program sets header variables to make the data look like stacked cdp data instead of shot gathers (the default behaviour).

The parameters input_source_coordinates, coordinate_scale_factor, use_geo_coordinates, and prefer_decimal_degrees, along with the -V command-line option, work together as described above. I emphasize that when use_geo_coordinates is false, receiver coordinates are extracted from the dnorth and deast fields of the site table and written in the headers in units of meters. When input_source_coordinates is true, the coordinates are treated as geographic if use_geo_coordinates is true, but are written verbatim if this is false. The coordinate_scale_factor is applied to ALL coordinate values as specified by the standard if input_source_coordinates is true. A reasonable scale factor is automatically chosen if input_source_coordinates is false. Note that the scale factor is always a number greater than one for this program and is used as a multiplier. Be aware that because the standard says a multiplier should be specified negative, this attribute (stored in byte offset 71-72 in the SEG-Y trace headers) will always be negative when written by this program.

For SEG-Y output versions 0 or SU, the boolean use_32bit_nsamp can be used if very long record lengths are desired. The segy standard stores the number of samples in a 16-bit integer field in both the reel and trace headers. If use_32bit_nsamp is set true, long record lengths will be handled and an extension field (num_samps in the PASSCAL SEG-Y extension definition), which is a 32-bit integer field, is used to store nsamp. The regular nsamp field is simply silently truncated using a cast to a 16-bit field. Use this feature with caution. Note that this option is ignored for SEG-Y Rev 1 because the PASSCAL extension field conflicts with a new field in the Rev 1 standard.
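To see when the 16-bit field becomes a problem, the following Python sketch computes the sample count and the value left after truncation. It assumes nsamp is simply sample_rate times trace_length and that the cast behaves like an unsigned 16-bit truncation; neither detail is confirmed by this page, and the function name is hypothetical.

```python
def nsamp_fields(sample_rate, trace_length):
    """Return (full nsamp, value left in a 16-bit field).

    Assumptions: nsamp = sample_rate * trace_length, and the cast keeps
    only the low 16 bits (unsigned). The program may differ on both.
    """
    nsamp = int(round(sample_rate * trace_length))
    truncated = nsamp & 0xFFFF
    return nsamp, truncated


print(nsamp_fields(250, 5.0))     # short record: both values agree
print(nsamp_fields(250, 600.0))   # long record: the 16-bit value is wrong
```

Under these assumptions a 600 s record at 250 samples per second needs 150000 samples, which no longer fits in 16 bits, so only a reader that understands the 32-bit extension field would see the correct trace length.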

The integer trace_gain_type overrides the "gain type of field instruments" field (bytes 119-120) in the trace header. 0 = unknown; 1 = fixed gain; 2 = binary; 3 = floating point; 4...N user defined. The default is 0 (unknown) to maintain compatibility with previous versions of this program. 1 (fixed) is probably a more sensible default.

The string text_header_description controls the contents of the first record of the textual file header. It is automatically truncated at 76 characters due to limitations of the file format, and is overridden by the -d command-line option.

Rotation Parameters

rotate is a logical that turns the rotation feature on and off. If rotate is set false, other rotation related parameters will be ignored. Note also that attempting to output rotated channels (see below) when rotate is false will, of course, either produce garbage or cause the program to die.

phi and theta are spherical coordinate angles that define how the standard E,N,Z coordinate system will be rotated on output (see trrotate(3) for a more extensive description). These parameters are passed directly to trrotate.

Channel order definition

Channel order definitions are controlled by a &Tbl tagged with the keyword "channels". The lines below the &Tbl{ tag should consist of a series of valid station channel pairs (blank separated -- see example below) for the data being converted. The data will be written in the same order as this list (top will be channel 1).

Rotated data are handled by special unalterable channel codes. Specifically, use Z, R, and T as channel codes to output vertical, radial, and transverse components respectively as defined by your transformation. The definition of these directions is, however, intimately related to the transformations defined in trrotate(3). First, the program calls rotate_to_standard to produce output traces tagged with channel codes X1, X2, and X3. The "standard" used is that X1 is +east, X2 is +north, and X3 is +up. This transformation is essential since data often have polarity differences from the standard and/or simple field setup errors. The program next calls trrotate using the angles phi and theta (see above). The best way to think of the results is how the X1, X2, and X3 coordinate system would be changed if rotated by spherical coordinate angles phi and theta. At the end of that transformation R is the transformed X1, T is the transformed X2, and Z is the transformed X3.

Note you can actually request the data transformed to "standard" coordinates by setting rotate to true and asking for channels X1, X2, and X3 instead of the original channel codes.

Database Table Parameters

join_tables is a &Tbl object that contains a list of database tables and the order they are to be joined when the program opens the input database. Two tables are absolutely required in this list -- the program will die if they do not appear in the list. They are: wfdisc and site. In addition, although sitechan is not strictly required, the program will produce garbage if three-component rotation is attempted and sitechan is not listed in this table. Finally, note that the receiver coordinates placed in the SEGY header come from the dnorth, deast fields of site.

Most users are unlikely to need to alter the default parameter file for this list. There is one special add-on table that is commented out in the example below. This table is called "shot" and is an extension to css3.0. If the "shot" line appears here, db2segy looks for a database table called shot. If it cannot find it defined in the schema, it will be ignored. If it is defined, the shot table will be used to set the source coordinate information. Provided the table joins correctly, the only information that the program attempts to extract from the shot table is the dnorth, deast, elev, and edepth fields. Other tables to set other parameters could be defined by a similar mechanism in datascope, but in this version only the "shot" table extension will work.

EXAMPLE

sample_rate 250
trace_length 5.0
rotate yes
# This set of parameters are only hit when rotate is turned on.
phi 80.0
theta 0.0
# end rotate parameters

#
#  This form outputs rotated channels
#
channels &Tbl{
100 Z
101 Z
102 Z
103 Z
104 Z
105 Z
106 Z
107 Z
108 Z
109 Z
110 Z
100 R
101 R
102 R
103 R
104 R
105 R
106 R
107 R
108 R
109 R
110 R
100 T
101 T
102 T
103 T
104 T
105 T
106 T
107 T
108 T
109 T
110 T
}
#
#  This is the pattern to use normal channel codes.
#  They are commented out for this example.
#
#channels &Tbl{
#100 EHZ
#101 EHZ
#102 EHZ
#103 EHZ
#104 EHZ
#105 EHZ
#106 EHZ
#107 EHZ
#108 EHZ
#109 EHZ
#110 EHZ
#100 EHN
#101 EHN
#102 EHN
#103 EHN
#104 EHN
#105 EHN
#106 EHN
#107 EHN
#108 EHN
#109 EHN
#110 EHN
#100 EHE
#101 EHE
#102 EHE
#103 EHE
#104 EHE
#105 EHE
#106 EHE
#107 EHE
#108 EHE
#109 EHE
#110 EHE
#}
#
#  This list of tables must at least include wfdisc or the trload_css will fail.
#  It should also normally have site listed second and have dnorth, deast filled
#  in.
#
join_tables &Tbl{
wfdisc
site
sitechan
origin
#shot
}
# trace_gain_type controls the value of the "gain type of field instruments"
# field (bytes 119-120) in the trace header.
# 0 = unknown; 1 = fixed gain; 2 = binary; 3 = floating point; 4...N optional
# Default is 0 (unknown) to maintain compatibility with previous versions
# of this program. 1 is probably a more sensible default.
trace_gain_type 0
text_header_description Antelope DB2SEGY

pf_revision_time 1413768151

DIAGNOSTICS

Numerous diagnostics are written using the elog facility that should help in sorting out problems. The list is too long to rationally repeat here.

SEE ALSO

trintro(3), trrotate(3), trload_css(3), pf(3), str2epoch(3),
 and the SEGY standard book.

BUGS AND CAVEATS

AUTHOR

Gary L. Pavlis and Geoffrey A. Davis


Antelope User Group Contributed Software