Wednesday, March 11, 2015

FLOWOVER/ MISSOVER/ TRUNCOVER/ SCANOVER/ STOPOVER


I put these five "over" options together because they all deal with reading data that has missing values.

FLOWOVER

This is the default for the INPUT statement. SAS jumps to the next line if the current line does not have enough data for the variables it has to read. After jumping to the next line, SAS reads that line whether or not it has enough data. In most cases the raw data is already organized line by line, so a line without enough data should be treated as missing data. That is when we need the following options.

MISSOVER

SAS stays on the current line. If the line does not have enough data, or a value is shorter than the length the INPUT statement asks for, SAS sets every variable from the current one to the last one in the variable list to missing.

TRUNCOVER

The difference between MISSOVER and TRUNCOVER is that TRUNCOVER keeps the data it has already read from a short line instead of setting it to missing. This option does not skip any information.
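
To make the difference concrete, here is a minimal sketch. The fileref NAMES, the two sample records, and the data set names are made up for illustration, and it assumes the usual Windows/UNIX behavior where PUT writes records without trailing blanks.

```sas
/* Hypothetical scratch file; the TEMP device lets SAS pick the location. */
filename names temp;

/* Write one full-length record (10 characters) and one short record (3). */
data _null_;
    file names;
    put 'Alexandria';
    put 'Bob';
run;

/* MISSOVER: the short record cannot fill the 10 columns requested. */
data with_missover;
    infile names missover;
    input name $10.;
run;

/* TRUNCOVER: the same short record is read as far as it goes. */
data with_truncover;
    infile names truncover;
    input name $10.;
run;

/* Compare the second observation in the two listings. */
proc print data=with_missover;  run;
proc print data=with_truncover; run;
```

Note that in-stream data (DATALINES) is not a good test here, because those records are padded with blanks and never look short.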

SCANOVER
SAS keeps skipping to the next line until it finds the @'character-string' specified in the INPUT statement. Used alone, it is exactly the same as the FLOWOVER option. In other words, SCANOVER needs to be combined with MISSOVER or STOPOVER to be useful.
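
A minimal sketch of SCANOVER paired with TRUNCOVER, modeled on the kind of phone-book example often used for this option. The fileref NOTES, the sample records, and the variable PHONE are all hypothetical.

```sas
/* Hypothetical scratch file holding free-form text. */
filename notes temp;

/* Only some records contain the marker string 'phone:'. */
data _null_;
    file notes;
    put 'Contact list';
    put 'Ann Lee phone: 555-0100';
    put 'Prefers email.';
    put 'Raj Patel phone: 555-0101';
run;

/* SCANOVER scans records for 'phone:'; TRUNCOVER then reads whatever
   follows the marker, even though it is shorter than 32 characters. */
data phones;
    infile notes truncover scanover;
    input @'phone:' phone $32.;
run;

proc print data=phones; run;
```

The idea is that records without the marker are passed over, so only the lines that carry a phone number should end up as observations.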

STOPOVER
Stops the DATA step when it reads a short line. SAS sets the variable _ERROR_ to 1 and keeps the observations it has already written to the data set.

Let's see an example. Suppose the file 'C:\OVERTEST.TXT' contains the following data:

55555
1
22
333
4444
55555
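
If you do not have such a file yet, one way to create it is a DATA _NULL_ step. This is only a sketch: it writes to the path used by the INFILE statement in the step that follows, and it assumes PUT writes each record without trailing blanks, which is what makes the lines "short".

```sas
/* Write the six test records, one PUT statement per line. */
data _null_;
    file 'C:\OVERTEST.TXT';
    put '55555';
    put '1';
    put '22';
    put '333';
    put '4444';
    put '55555';
run;
```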


Read the data with each of the options described above.

DATA TEST;
    /* Swap TRUNCOVER for the option you want to test. */
    INFILE 'C:\OVERTEST.TXT' TRUNCOVER;
    INPUT TESTNUM 5.;
RUN;


PS: TRUNCOVER here stands for whichever of the options above is being tested (FLOWOVER/MISSOVER/TRUNCOVER/SCANOVER/STOPOVER).


Result:

OBS   FLOWOVER   MISSOVER   TRUNCOVER   STOPOVER
1     55555      55555      55555       55555
2     22         .          1
3     4444       .          22
4     55555      .          333
5                .          4444
6                55555      55555

FLOWOVER: SAS skips the second line because '1' is short and reads the third line, '22', instead. It then skips '333' and reads the next line, '4444', instead.
MISSOVER: Sets the short lines, line 2 to line 5, to missing.
TRUNCOVER: Keeps the information from line 2 to line 5.
STOPOVER: Stops at line 2 because it is a short line and prints the variable values together with _ERROR_=1 in the log. If the data set TEST already exists, it is left unchanged. Otherwise SAS creates TEST with just one observation.


What happens if these options are used together?


FLOWOVER combined with MISSOVER or TRUNCOVER: FLOWOVER is ignored.

SCANOVER combined with MISSOVER or TRUNCOVER: SCANOVER takes priority on lines that contain the @'character-string'. Other lines are handled according to MISSOVER or TRUNCOVER.


STOPOVER has the highest priority.
