DC_READ_FIXED Function

Reads fixed-formatted ASCII data using a format that you specify.


Returned Value

< 0	Indicates an error, such as an invalid filename or an I/O error.
0	Indicates a successful read.

Keywords

Bytes_Per_Rec A long integer that specifies how many characters comprise a single record in the input data file; use only with column-oriented files. If not provided, each line of data in the file is treated as a new record. For more details about when to use the Bytes_Per_Rec keyword, see Example 5.

Column A flag that signifies filename is a column-organized file.

Dt_Template An array of integers indicating the date/time templates that are to be used for interpreting date/time data. Positive numbers refer to date templates; negative numbers refer to time templates. For more details, see Example 6. To see a complete list of date/time templates, see the PV-WAVE Programmer's Guide.

Filters An array of one-character strings that PV-WAVE should check for and filter out as it reads the data. A character found on the keyboard can be typed; a special character not found on the keyboard is specified by ASCII code. For more details, see Example 2.

Format A string containing the C- or FORTRAN-like format statement that will be used to read the data. The format string must contain at least one format code that transfers data; FORTRAN formats must be enclosed in parentheses. If not provided, a C format of %1f is assumed.

Ignore An array of strings; if any of these strings are encountered, PV-WAVE skips the entire record and starts reading data from the next line. Any string is allowed, but the following three strings have special meanings:


$BLANK_LINES	Skip all blank lines; this prevents those lines from being interpreted as a series of zeroes.
$TEXT_IN_NUMERIC	Skip any line where text is found in a numeric field.
$BAD_DATE_TIME	Skip any line where invalid date/time data is found.

For an example showing how to use the Ignore keyword, see Example 7.

Miss_Str An array of strings that may be present in the data file to represent missing data. If not provided, PV-WAVE does not check for missing data as it reads the file. For an example showing how to use the Miss_Str keyword, see DC_READ_FREE, Example 3.

Miss_Vals An array of integer or floating-point values, each of which corresponds to a string in Miss_Str. As PV-WAVE reads the input data file, occurrences of strings that match those in Miss_Str are replaced by the corresponding element of Miss_Vals.

Nrecs Number of records to read. If not provided or if set equal to zero (0), the entire file is read. For more information about records, see Physical Records vs. Logical Records.

Nskip Number of physical records in the file to skip before data is read. If not provided, or set equal to zero (0), no records are skipped.

Resize An array of integers indicating the variables in var_list that can be resized based on the number of records detected in the input data file. Values in Resize should be in the range:

1 <= Resize_n <= #_of_vars_in_var_list

For an example showing how to use the Resize keyword, see Example 4.

Row A flag that signifies filename is a row-organized file. If neither Row nor Column is present, Row is the default.

Discussion

DC_READ_FIXED can interpret either FORTRAN- or C-style formats and is well suited to reading column-oriented data files. It also handles several steps that you must perform yourself when using other PV-WAVE functions and procedures: 1) opening the file, 2) assigning it a logical unit number (LUN), and 3) closing the file when you are done reading the data.

If neither the Row nor the Column keyword is provided, the file is assumed to be organized by rows. If both keywords are used, the Row keyword is assumed.

NOTE: This function can be used to read data into date/time structures, but not into any other kind of structures.

String Resources Used By This Function

Upon execution, the DC_READ_FIXED function examines two strings in a string resource file. These strings, described below, allow you to control how the function handles binary files.

The string resource file is:

(UNIX) wavedir/xres/!Lang/kernel/dc.ads

(OpenVMS) wavedir:[XRES.!Lang.KERNEL]DC.ADS

(Windows) wavedir\xres\!Lang\kernel\dc.ads

where wavedir is the main PV-WAVE directory.

The strings that are examined are DC_binary_check and DC_allow_chars.

DC_binary_check This string can be set to the values True or False. If set to True, the data file is checked for the presence of binary characters before the file is read. If binary characters are found, the file is not read. If this string is set to False, no binary character checking is performed. (Default: True)

For example, to turn off binary checking, set the string as follows in the dc.ads file:

DC_binary_check: False

DC_allow_chars This string lets you specify additional characters to allow in the check for binary files. Before a file is read, the first several lines are checked for the presence of non-printable characters. If non-printable characters are found, the file is considered to be a binary file and is not read. By default, all printable characters in the system locale are allowed. Characters can be specified either by entering them directly or numerically, as three-digit decimal values each preceded by a "\" (backslash).

For example, to allow characters 165 and 220, set the string as follows in the dc.ads file:

DC_allow_chars: \165\220

How the Data is Transferred into Variables

As many as 255 variables can be included in the input argument var_list. You can use the continuation character ($) to continue the function call onto additional lines, if needed. Any undeclared variables in var_list are assumed to have a data type of float (single-precision floating-point).

As data is being transferred into multi-dimensional variables, those variables are treated as collections of scalar variables, meaning the first subscript of the import variable varies the fastest. For two-dimensional import variables, this implies that the column index varies faster than the row index. In other words, data is transferred into a two-dimensional import variable one row at a time. For more details about reading column-oriented data into multi-dimensional variables, see Example 4 (in the DC_READ_FREE function description).

The format string is processed from left to right. Record terminators and format codes are processed until no variables are left in the variable list or until an error occurs. In a FORTRAN format string, when a slash record terminator ( / ) is encountered, the rest of the current input record is ignored, and the next input record is read.

Format codes that transfer data are matched with the next available variable (or element of a multi-dimensional variable) in the variable list var_list. Data is read from the file and formatted according to the format code. If the data from the file does not agree with the format code, or the format code does not agree with the type of the variable, a type conversion is performed. If no type conversion is possible, an error results and a nonzero status is returned.

Once all variables in the variable list have been filled with data, DC_READ_FIXED stops reading data, and returns a status code of zero (0). This is true even if there are format codes in Format that did not get used. Even if an error occurs, and status is nonzero, the data that has been read successfully (prior to the error) is returned in the var_list variables.

TIP: If an error does occur, use the PRINT command to view the contents of the variables to see where the last successfully read value occurs. This will enable you to isolate the portion of the file in which the error occurred.
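A minimal sketch of this check is shown below. The file name run.dat, the variable name temps, and the format are hypothetical and chosen only for illustration:

status = DC_READ_FIXED('run.dat', temps,  $
    Format='(F7.2)', /Column)
; A nonzero status signals an error. The values read
; before the error are still returned in temps.
IF status NE 0 THEN PRINT, 'Read failed, status = ', status
PRINT, temps

Comparing the printed values with the file shows where the last successfully read value occurs.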

If the format string does not contain any format codes that transfer data, an error occurs and a nonzero status is returned. The format codes that PV-WAVE recognizes are listed in the PV-WAVE Programmer's Guide.

Format Reversion when Reading Data

If the last closing parenthesis of the format string is reached and there are still unfilled variables remaining, format reversion occurs. In format reversion, the current record is terminated, a new one is read, and format string processing reverts to the first group repeat specification that does not have an explicit repeat count. If the format does not contain a group repeat specification, format processing reverts to the initial opening parenthesis of the format string.

For more information about format reversion and group repeat specifications, see the PV-WAVE Programmer's Guide.

Physical Records vs. Logical Records

In an ASCII text file, the end-of-line is signified by the presence of either a CTRL-J or a CTRL-M character, and a record extends from one end-of-line character to the next. However, there are actually two kinds of records:

physical records
logical records

For column-oriented files, the amount of data in a physical record is often sufficient to provide exactly one value for each variable in var_list, and then it is a logical record, as well. For row-oriented files, the concept of logical records is not relevant, since data is merely read as contiguous values separated by delimiters, and the end-of-line is merely interpreted as another delimiter.

NOTE: The Nrecs keyword counts by logical records, if they have been defined. The Nskip keyword, on the other hand, counts by physical records, regardless of any logical record size that has been defined.
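As an illustration of how the two keywords count, assume a hypothetical column-oriented file survey.dat. Its first three lines are header text, and each remaining line holds one value per variable in the same layout as Example 1 (7-character values followed by five blanks). The file name and variable names are assumptions made for this sketch:

status = DC_READ_FIXED('survey.dat', depth, temp,  $
    Format='(F7.2,5X)', /Column,  $
    Nskip=3, Nrecs=100)
; Nskip=3 skips three physical records (lines).
; Nrecs=100 requests 100 records. Because Bytes_Per_Rec
; is not set, each record is one line of the file.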

Changing the Logical Record Size

You can use the Bytes_Per_Rec keyword to explicitly define a different logical record size, if you wish. However, in most cases, you do not need to provide this keyword. For an example of when to use the Bytes_Per_Rec keyword, see Example 5.

NOTE: By default, DC_READ_FIXED considers the physical record to be one line in the file, and the concept of a logical record is not needed. But if you are using logical records, the physical records in the file must all be the same length. The Bytes_Per_Rec keyword can be used only with column-oriented data files.

Filtering and Substitution While Reading Data

If you want certain characters filtered out of the data as it is read, use the Filters keyword to specify these characters. Each character (or sequence of digits that represents the ASCII code for a character) must be enclosed with single quotes. For example, either of the following is a valid specification:

',' or '44'

Furthermore, the two specifications shown above are equivalent to one another. For more examples of using the Filters keyword, see Example 2 and Example 4.

Characters that match one of the values in Filters are treated as if they aren't even there; in other words, these characters are not treated as data and do not contribute to the size of the logical record, if one has been defined using the Bytes_Per_Rec keyword.

NOTE: If you want to supply multi-character strings instead of individual characters, you can do this with the Ignore keyword. However, keep in mind that a character that matches Filters is simply discarded, and filtering resumes from that point, while a string that matches Ignore causes that entire line to be skipped.

So if you are reading a data file that contains a value such as #$*10.00**, but you don't want the entire line to be skipped, filter the characters individually with Filters = ['#', '$', '*'], instead of collectively with Ignore = ['#$*', '**'].

Missing Data Substitution

PV-WAVE expects to substitute a value from Miss_Vals whenever it encounters a string from Miss_Str in the data. Consequently, if the number of elements in Miss_Str does not match the number of elements in Miss_Vals, a nonzero status is returned and no data is read. The maximum number of values permitted in Miss_Str and Miss_Vals is 10.

If the end of the file is reached before all variables are filled with data, the remainder of each variable is set to Miss_Vals(0) if it was specified, or 0 (zero) if Miss_Vals was not specified. In this case, status is returned with a value less than zero to signify an unexpected end-of-file condition.
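The following sketch shows the two keywords used together. It assumes a hypothetical file readings.dat in which missing values appear as the strings NA or ---. The file name, variable name, format, and substitute values are illustrative only:

level = FLTARR(24)
status = DC_READ_FIXED('readings.dat', level,  $
    Format='(F7.2,5X)', /Column,  $
    Miss_Str=['NA', '---'],  $
    Miss_Vals=[-999.0, -1.0])
; 'NA' is replaced by -999.0 and '---' by -1.0.
; If the file ends before level is filled, the rest of
; level is set to Miss_Vals(0), here -999.0, and status
; is returned with a value less than zero.
IF status LT 0 THEN PRINT, 'Incomplete read, status = ', status

Miss_Str and Miss_Vals each contain two elements here, which satisfies the requirement that the two arrays have the same number of elements.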

Reading Row-Oriented Files

If you include the Row keyword, each variable in var_list is completely filled before any data is transferred to the next variable.

The dimensionality of the last variable in var_list can be unknown; a variable of length n is created, where n is the number of values remaining in the file. All other variables in var_list must be pre-dimensioned.

If you include the Resize keyword with the call to DC_READ_FIXED, the last variable can be redimensioned to match the actual number of values that were transferred to the variable during the read operation.
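The rules above for pre-dimensioned and undefined variables can be sketched as follows. The sketch assumes a hypothetical row-oriented file profile.dat laid out like the file in Example 7 (7-character values separated by five blanks). The file name and variable names are illustrative:

head = FLTARR(12)
; head is pre-dimensioned, so it is filled first, with
; the first 12 values in the file.
status = DC_READ_FIXED('profile.dat', head, tail,  $
    Format='(F7.2,5X)', /Row)
; tail was not defined beforehand. As the last variable
; in var_list, it is created with one element for each
; value remaining in the file.
PRINT, N_ELEMENTS(tail)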

If you are interested in an illustration showing what row-oriented data can look like inside a file, see the PV-WAVE Programmer's Guide.

Reading Column-Oriented Files

If you include the Column keyword, DC_READ_FIXED views the data files as a series of columns, with a one-to-one correspondence between columns in the file and variables in the variable list. In other words, one value from the first record of the file is transferred into each variable in var_list, then another value from the next record of the file is transferred into each variable in var_list, and so forth, until all the data in the file has been read, or until the variables are completely filled with data.

If a variable in var_list is undefined, a floating-point variable of length n is created, where n is the number of records read from the file. To get a similar effect in an existing variable, include the Resize keyword with the function call.

All variables specified with the Resize keyword are redimensioned to the same length: the length of the longest column of data in the file. The variables that correspond to the shorter columns in the file will have one or more values added to the end: either Miss_Vals(0) if it was specified, or 0 (zero) if Miss_Vals was not specified.

If you are interested in an illustration demonstrating what column-oriented data can look like inside a file, see the PV-WAVE Programmer's Guide.

Multi-dimensional Variables

The following table shows how column-oriented data in a file is read into multi-dimensional variables:


Dimensions of Variable	How Data is Read From the File (If Variable is Pre-dimensioned)
One-dimensional (1 x n)	One value read from each record of file (repeated n times)
Two-dimensional (m columns by n rows)	m values read from each record of file (repeated n times)
Three-dimensional (m x n x p)	m values read from each record of file (repeated n times) (entire process repeated p times)
q-dimensional (m x n x p x q)	m values read from each record of file (repeated n times) (above process repeated p times) (entire process repeated q times)

You can combine one- and two-dimensional variables in var_list, as long as the second dimension of the two-dimensional variable matches the dimension of the one-dimensional variable. For example, with two variables, var1(50) and var2(2,50), one column of data will be transferred to var1 and two columns of data will be transferred to var2.

NOTE: If you want to intermingle multi-dimensional variables in var_list, you must be sure that the product of all dimensions (excluding the first dimension) of each variable is equal. Two-, three-, and four-dimensional variables can be combined in var_list as long as their dimensions satisfy this rule.
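One set of dimensions that satisfies this rule is sketched below. The variable names, dimensions, file name, and format are hypothetical and chosen only for illustration. In each case the product of the dimensions after the first is 50:

var2 = FLTARR(2, 50)
var3 = FLTARR(3, 5, 10)
var4 = FLTARR(4, 2, 5, 5)
; Product of all dimensions except the first:
;   var2: 50    var3: 5*10 = 50    var4: 2*5*5 = 50
status = DC_READ_FIXED('grid.dat', var2, var3, var4,  $
    Format='(F7.2,5X)', /Column)

Following the table above, each record would be expected to supply two values to var2, three to var3, and four to var4, for 50 records in all.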

Example 1

The function call:

status=DC_READ_FIXED('results.wp', /Column,  $
    unit1, unit2, unit3, run_total, Ignore=  $
    ["Total", "------", "$TEXT_IN_NUMERIC",  $
    "$BLANK_LINES"], Format="(F7.2,5X)")

reads the data from file results.wp and places the data into four variables: unit1, unit2, unit3, and run_total.

Because the variables were not predefined, all data is interpreted as single-precision floating-point data, and all variables are treated as resizable one-dimensional arrays. Any blank lines or strings specified with the Ignore keyword (in this example, "Total" and "------") are ignored. Also, any line with non-numeric characters in a numeric field is ignored.

Example 2

The function call:

status = DC_READ_FIXED('yields.doc', intake,  $
    chute, conveyor, crusher, /Column,  $
    Filters=['/', ':', ','],  $
    Format="(F7.2, 8X, F6.4, 3X)",  $
    Ignore=["$BLANK_LINES"])

reads data from the file yields.doc and places the data into four variables: intake, chute, conveyor, and crusher.

Because the variables were not predefined, all data is interpreted as single-precision floating-point data, and all variables are treated as resizable one-dimensional arrays. Any extraneous characters (in this example, "/", ":", and ",") are discarded because the Filters keyword is provided. Also, all totally blank lines in the file are ignored.

Example 3

The data file shown below is a fixed-formatted ASCII file named simple.dat. The `.' characters in simple.dat represent blank spaces:

...1...2...3...4...5
...6...7...8...9..10
..11..12..13..14..15
..16..17..18..19..20

The function call:

status = DC_READ_FIXED('simple.dat', var1,  $
    Format='(I4)', /Column)

results in var1=[1.0, 6.0, 11.0, 16.0]. Because var1 was not predefined, DC_READ_FIXED creates it as a one-dimensional floating-point array.

On the other hand, the commands:

var1 = INTARR(2)
var2 = INTARR(2)
status = DC_READ_FIXED('simple.dat', var1,  $
    var2, Format='(2(4X, I4))', Nskip=2)

skip the first two records in the file and result in var1=[12, 14] and var2=[17, 19]. Because neither the Row nor the Column keyword was supplied, the file is assumed to use row organization.

Example 4

The data file shown below is a fixed-formatted ASCII file; this file is named nimrod.dat. The `.' characters in nimrod.dat represent blank spaces. nimrod.dat is very much like the data file in Example 3:

...1...2...3...4...5
...6...7.......9..10
..11..12..13..14..15
..16..17..18..19..20

When reading this file as column-oriented data, the results vary, depending on whether a C or FORTRAN format string is being used, and whether the Resize keyword has been included in the function call to DC_READ_FIXED.

For example, the commands:

A = INTARR(20) & B = INTARR(20)
C = INTARR(20) & D = INTARR(20)
E = INTARR(20)
status = DC_READ_FIXED('nimrod.dat',  $
    A, B, C, D, E, Format='(2X, I2)',  $
    Resize=[1, 2, 3, 4, 5], /Column)

result in A=[1, 6, 11, 16], B=[2, 7, 12, 17], C=[3, 0, 13, 18], D=[4, 9, 14, 19], and E=[5, 10, 15, 20]. The missing value is interpreted as a zero (0). All variables are resized to a length of 4.

On the other hand, the commands:

A = INTARR(20) & B = INTARR(20)
C = INTARR(20) & D = INTARR(20)
E = INTARR(20)
status = DC_READ_FIXED('nimrod.dat',  $
    A, B, C, D, E, Format='%d',  $
    Resize=[1, 2, 3, 4, 5], /Column)

result in A=[1, 6, 11, 16], B=[2, 7, 12, 17], C = [3, 9, 13, 18], D = [4, 10, 14, 19], and E = [5, 15, 20]. The missing value is skipped altogether, and E is resized to a length of 3 to reflect the number of values that were transferred into the variable. The other variables are resized to 4.

Any variable that is not resizable (because it was omitted from the Resize vector), will be padded to the end with extra values. For the latter of the two calls to DC_READ_FIXED shown in this example, A, B, C, and D would be padded with an additional 16 zeroes, while E would be padded with an additional 17 zeroes. (Zeroes are used for the padding because Miss_Vals was not specified.)

If the file nimrod.dat had used some other character as a delimiter, such as commas or slashes, both the C and FORTRAN format strings would have yielded the same result, namely, C = [3, 0, 13, 18]. It is only because of the way a C format skips over blank space that the C format was unable to detect the presence of a missing value.

Example 5

The data file shown below contains 18 pairs of XY data that could be used to create a scatter plot:

5.992E+04,7.121E-01,8.348E+04,7.562E-01,5.672E+04,9.451E-01,
5.459E+04,8.659E-01,7.088E+04,8.659E-01,8.541E+04,3.437E-01,
4.981E+04,4.679E-01,8.438E+04,5.019E-01,6.902E+04,7.340E-01,
6.239E+04,8.023E-01,7.865E+04,6.643E-01,5.870E+04,9.992E-01,
7.439E+04,9.456E-01,4.672E+04,9.801E-01,6.872E+04,4.325E-01,
6.362E+04,5.894E-01,8.992E+04,7.509E-01,2.785E+04,4.796E-01,

For data organized like this, use the Bytes_Per_Rec keyword to specify the exact length of the record. In this example, all X values are single-precision floating-point numbers with an exponent of E+04, and all Y values are single-precision floating-point numbers with an exponent of E-01. Each value therefore occupies 9 ASCII characters (bytes), and each XY pair occupies 18. Thus, you would specify 20 bytes per record (9 times 2, plus 2 more bytes for the comma delimiters separating values).

status=DC_READ_FIXED(/Column, "xy5.dat", Xa, $
    Ya, Format="(E9.3, 1X)", Bytes_Per_Rec=20)

If you omit the Bytes_Per_Rec keyword, but still read the file as a column-oriented file, only the first pair of data values on each line would actually be transferred into the variables Xa and Ya. Nor can the file be read as row-oriented data, because Xa would be filled completely before any data was transferred to Ya.

TIP: Only include the Bytes_Per_Rec keyword when you have a logical record that is longer or shorter than one line in the file. For the majority of column-oriented data files, one and only one value from each variable is on a single line, and the Bytes_Per_Rec keyword is completely unnecessary.

Example 6

Assume that you have a file, chrono.dat, that contains some data values and also some chronological information about when those data values were recorded:

01/01/92 10:30:35 10.00 04-30-92 32767
02/01/92 23:22:15 15.89 06-15-91 99999
05/15/91 03:03:03 14.22 12-25-92 87654

The date/time templates that will be used to transfer this data have the following definitions:


Number	Template Description
1	MM*DD*YY (* = any delimiter)
-1	HH*MM*SS (* = any delimiter)

To read the date and time from the first two columns into one date/time variable and read the third column of floating-point data into another variable, use the following commands:

date1 = REPLICATE({!DT},3)
date2 = REPLICATE({!DT},3)

; The system structure definition of date/time is !DT. Date/time 
; variables must be defined as !DT structure arrays before being 
; used if the date/time data is to be read as such.

status = DC_READ_FIXED("chrono.dat", date1,  $
    date1, decibels, Dt_Template=[1,-1],  $
    Format="(2(A8, 1X), F5.2)", /Column)

; The variable date1 is listed twice; this way, both the date data 
; and the time data can be stored in the same variable, date1.

To read all columns, change the call to DC_READ_FIXED and define a new variable:

calib = INTARR(3)
status = DC_READ_FIXED("chrono.dat", date1,  $
    date1, decibels, date2, calib, /Column,  $
    Format="%8s %8s %f %8s %d", Ignore=  $
    ["$BAD_DATE_TIME"], Dt_Template=[1,-1])

Notice how the date/time templates are reused. For each new record, Template 1 is used first to read the date data into date1. Next, Template -1 is used to read the time data into date1. Finally, since there is another date/time variable to be read (date2) and there are no more templates left, the template list is reset and Template 1 is used again. The template list is reset for each record.

NOTE: Because of the internal conversion that DC_READ_FIXED performs to convert the date strings to PV-WAVE's date/time internal structure, the date and time data must be read with the A8 (FORTRAN) or %8s (C) format string.

Normally an error would be reported if the input text to be read as date/time is invalid and cannot be converted. But because the Ignore=["$BAD_DATE_TIME"] keyword was provided, any record containing this type of error is ignored and no error is reported.

Example 7

The data file shown below is a fixed-formatted ASCII file named wages.wp. All floating-point data in the file has been decimal-point-aligned by a word-processing application:


1070.00     9007.97     1100.00     1250.00      850.50     2010.00
5000.00     3050.00     1044.12     3500.00     6031.00      905.00
 415.00     5200.00     1300.10      350.00      745.00     3000.00
 200.00     3100.00     8100.00     7050.00     6780.00     2310.25
 950.00     1050.00     1350.00      410.00      797.00      200.36
2600.00     2000.00     1500.00     2000.00     1000.00      400.00
1000.00     9000.00     1100.00     2091.00     3440.10     2000.37
5000.00     3000.00     1000.01     3500.00     6000.00      900.12

The following commands:

Maria = Fltarr(12) & Naomi = Fltarr(12)
Klaus = Fltarr(12) & Carlos = Fltarr(12)
status = DC_READ_FIXED('wages.wp', Maria,  $
    Carlos, Klaus, Naomi, Format="(F7.2,5X)",  $
    Ignore=["$BLANK_LINES"])

read the data from file wages.wp and place the data into four variables: Maria, Carlos, Klaus, and Naomi. By default, row organization is assumed in the file, with five spaces separating the values in the file.

With row organization, each variable is "filled up" before any data is transferred to the next variable in the variable list. This means that the first two lines of the file are transferred into the variable Maria, the next two lines of the file are transferred into the variable Carlos, the next two lines of the file are transferred into the variable Klaus, and the last two lines of the file are transferred into the variable Naomi. The blank lines in the file are skipped entirely, preventing those lines from being interpreted as a series of zeroes.
