Solved: SAS data step same operation with different results

twedwards

Hi all,

I came across an interesting behavior in the SAS data step. Essentially, two identical calculations have different results.

data indata;
  a = 3;

  m = 1/a;
  n = 1/a;
 
  x = m-n;
  y = m-n;
run;
 
proc print data=indata;
run;

The results are below.

As you can see, X and Y are different, despite both being set to the same operation of (m-n). I understand that there are issues with floating point values in programming where a subtraction of two identical fractions may not result in 0. My confusion is why x and y are not either both 0 or both the value of -1.8513E-17.

Cheers

Kathryn_SAS

I am in Technical Support. I am sorry I led you down the path of numeric precision when that isn't exactly the issue here. I am running the exact same version of 9.4M6 as you:

13 %put &sysvlong;
9.04.01M6P110718

and I am not replicating the same difference that you show.

While the behavior does seem strange and incorrect, the fact that we cannot replicate it in the same version and the fact that it does not appear to occur in later releases suggests that it would not be treated as a bug for M6. Ultimately, the way to account for the difference when using imprecise values is to use the ROUND function.
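As a minimal sketch of the ROUND advice, the step below compares X and Y three ways: exactly, within a tolerance, and after rounding both to the same unit. The data set name CHECK, the flag variable names, and the tolerance value 1e-12 are illustrative assumptions, not part of the original code; pick a tolerance that suits the scale of your data.

```sas
data check;
  a = 3;
  m = 1/a;
  n = 1/a;
  x = m - n;
  y = m - n;

  /* tol is a hypothetical tolerance chosen for illustration only. */
  tol = 1e-12;

  /* Exact comparison: 1 if the stored values are bit-for-bit equal, else 0. */
  exact_equal = (x = y);

  /* Tolerance comparison: treats differences smaller than tol as equal. */
  close_enough = (abs(x - y) < tol);

  /* Rounding both sides to the same unit before comparing. */
  rounded_equal = (round(x, tol) = round(y, tol));
run;

proc print data=check;
run;
```

If EXACT_EQUAL is 0 while the other two flags are 1, the difference is confined to the low-order bits of the stored values.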

Since you are running on Windows 32-bit, the following SAS Note might provide some additional explanation:

6214 - Similar equations yield different results on Intel machines
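One possible reading of the value -1.8513E-17 from the original post, offered as an assumption and not as a confirmed diagnosis: suppose one operand of m-n was still held in the x87 FPU's 80-bit extended format (64-bit significand) while the other had already been rounded to a 64-bit double (53-bit significand). Nothing in this thread confirms that this is what happened on the affected machines, but the arithmetic under that assumption works out as follows.

```latex
\begin{align*}
\tfrac13 &= \sum_{k\ge 1} 2^{-2k} = 0.010101\ldots_2 \\
\mathrm{fl}_{53}\!\left(\tfrac13\right) &= \tfrac13 - \tfrac13\cdot 2^{-54}
  && \text{(discarded tail is } \tfrac13 \text{ ulp, so it rounds down)} \\
\mathrm{fl}_{64}\!\left(\tfrac13\right) &= \tfrac13 + \tfrac13\cdot 2^{-65}
  && \text{(discarded tail is } \tfrac23 \text{ ulp, so it rounds up)} \\
\mathrm{fl}_{53}\!\left(\tfrac13\right) - \mathrm{fl}_{64}\!\left(\tfrac13\right)
  &= -\tfrac13\cdot 2^{-54}\left(1 + 2^{-11}\right)
   = -683\cdot 2^{-65} \approx -1.8513\times 10^{-17}
\end{align*}
```

When both operands are rounded to the same precision, the difference is exactly 0, which would account for one of the two variables being 0 and the other being -1.8513E-17.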

If you would like to continue to pursue this through a case with Technical Support, please feel free to open a new one and include your complete log and output.


Tom

Are you sure you printed the same dataset as made by that code?

1    data indata;
2      a = 3;
3
4      m = 1/a;
5      n = 1/a;
6
7      x = m-n;
8      y = m-n;
9    run;

NOTE: The data set WORK.INDATA has 1 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


10
11   proc print data=indata;
12   run;

NOTE: There were 1 observations read from the data set WORK.INDATA.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

twedwards

Yes. I just re-ran in a fresh session. I added two additional 'flags' to check equality.

data indata;
  a = 3;

  m = 1/a;
  n = 1/a;
 
  x = m-n;
  y = m-n;

  if m=n then FLAG1="Y";
  if x=y then FLAG2="Y";
run;
 
proc print data=indata;
run;

What SAS version are you running? A coworker ran the above code on SAS EG as well, and observed similar results.

FoxHope

Hi Tom,

I reran the same code and got the result that the OP had.

However, if I insert the "put" command in between, suddenly both are 0 again. But outputting the values to log shouldn't change the calculation, should it?

Best,

Quentin

I can't replicate your results. I ran on 9.4M7 (Windows) and 9.4M8 (Linux).

The Boston Area SAS Users Group (BASUG) is hosting an in person Meeting & Training on June 27!
Full details and registration info at https://www.basug.org/events.

twedwards

Hi Quentin,

Thanks for testing on your machine! It appears I have a different version.

Cheers

FoxHope

Hi,

After running proc product_status, it seems that my version is 9.4_M6 with image version 9.04.01M6P110718

Tom

Pictures of code are not that helpful. The text itself is better. Same for the LOG. And preferably the OUTPUT also, but using ODS instead of plain old text output makes that difficult now.

Showing the LOG would help us be sure that you have actually created a NEW dataset. If the data step failed for some reason, such as having the file open in another window, then the printout is of something different. Perhaps in that different dataset X and Y were calculated differently.

Also it would be interesting to see if the pattern continues if you change the code.

Say by making more variables that should have the same result. Do they all get different values? Or are the two values you are getting now the only possible results?

Or perhaps just adding other statements that don't really do anything might make the data step compiler generate different code and cause it to work.
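A rough sketch of that experiment follows. The data set name INDATA2 and the extra variables Z and W are made up for illustration. The values are displayed through a HEX16. format in PROC PRINT, not through a PUT statement, because adding PUT to the data step was reported earlier in this thread to change the result.

```sas
data indata2;
  a = 3;

  m = 1/a;
  n = 1/a;

  /* Four copies of the same subtraction; all should be identical. */
  x = m - n;
  y = m - n;
  z = m - n;
  w = m - n;
run;

/* HEX16. shows the stored 8-byte floating-point value of each variable, */
/* so differences that a default numeric format might hide are visible.  */
proc print data=indata2;
  format m n x y z w hex16.;
run;
```

Check the log for the NOTE confirming that WORK.INDATA2 was created before reading the printout.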

Kathryn_SAS

This is classic numeric precision in SAS. We recommend using the ROUND function.

data indata;
  a = 3;

  m = 1/a;
  n = 1/a;
 
  x = round(m-n,.01);
  y = round(m-n,.01);

  if m=n then FLAG1="Y";
  if x=y then FLAG2="Y";
run;
 
proc print data=indata;
run;

FoxHope

Hi,

Do you mind if I ask a further clarifying question? It seems that this goes beyond the precision issue. If 1/3 cannot be represented precisely in binary, shouldn't the values of c and d BOTH be off by the same amount, so that their difference is zero regardless?

I tried rerunning the code, and it seems that simply writing the results to the log using PUT (as seen in my post above) changes the result of r and s. This seems too unintuitive to be explained by the numeric precision issue.

Quentin

@FoxHope wrote:

Hi,

Do you mind if I ask a further clarifying question? It seems that this goes beyond the precision issue. If 1/3 cannot be represented precisely in binary, shouldn't the values of c and d BOTH be off by the same amount, so that their difference is zero regardless?

Agree, what you are reporting doesn't make sense to me, and isn't the basic numeric precision issue.

Even if somehow these two statements could return different values (because of what, random noise in precision):

  m = 1/a;
  n = 1/a;

I can't think of an explanation for how these two statements could return different values (except, again, random noise in precision...):

  x = m-n;
  y = m-n;

And yes, your addition of a PUT statement shouldn't change values.

Something odd is happening. Did you quit SAS and restart a fresh session?

FoxHope

Hi Quentin,

Yes, I restarted my computer (just in case), restarted the session, and tried both SAS 9.4 and SAS EG. I still get the same result that I posted earlier. I wonder if it's because I am running a slightly older version of SAS (9.4_M6), since other people can't replicate what I see (except the OP, who gets the same results).

Best

Quentin

You might try submitting to tech support. I can't think of an explanation. It's interesting that you both have the same version (9.4M6) showing this odd behavior. But I still have a hard time believing this is a weird bug.

Kathryn_SAS

Here is another example that may help clarify the numeric precision issue. Note that both X and Y appear to be 1 but they are slightly different when viewed in hex.

data test;
  x=1;
  y=1/10;

  do i=1 to 9;
    x+1;
    y + 1/10;
  end;
  x=x/10;

  put x hex16.;
  put y hex16.;
run;

3FF0000000000000
3FEFFFFFFFFFFFFF

Whenever the numeric variable is manipulated with an arithmetic function, the manipulation is done in binary. In other words, the floating point number is manipulated in binary and the result we see is the decimal form of the manipulation. SAS does not convert the number to decimal form, and then perform the manipulation in decimal.

Therefore, no matter which SAS step you use (PROC SQL, the DATA step, or any other procedure), there is a chance that the result of the binary arithmetic will not match the result of the same arithmetic done in decimal.

SAS uses floating point representation because it is the numeric representation that the operating system can manipulate easily. Using floating point allows the operating system to manipulate the bytes SAS stores without the need to convert those bytes to a binary form. This saves a significant amount of processing time. Additionally, using floating point representation, SAS can store numbers with larger magnitude and greater precision in fewer bytes than if we used a decimal representation.
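To see the binary representation directly, a small sketch like the one below writes a few decimal fractions to the log in HEX16. format and tests whether 0.1 + 0.2 is stored identically to 0.3. The variable names are illustrative only, and the exact hex strings depend on the platform's floating-point format, so none are listed here.

```sas
data _null_;
  third        = 1/3;
  tenth        = 0.1;
  total        = 0.1 + 0.2;
  three_tenths = 0.3;

  /* Write the stored 8-byte value of each variable to the log in hex. */
  put third= hex16.;
  put tenth= hex16.;
  put total= hex16.;
  put three_tenths= hex16.;

  /* The comparison is done on the stored binary values, not on the decimal text. */
  if total = three_tenths then put "0.1 + 0.2 is stored identically to 0.3";
  else put "0.1 + 0.2 differs from 0.3 in the stored bits";
run;
```

None of these fractions has a finite binary expansion, so each stored value is the nearest representable number, not the exact decimal.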

Quentin

That makes sense as the usual numeric precision issue.

But I don't see how that could explain:

  a = 3;

  m = 1/a;
  n = 1/a;

Assigning different values to M and N. Can it?
