[csapp] Lab1 Data Lab

Lab1 Data Lab: Manipulating Bits

1. Introduction

Instruction: http://csapp.cs.cmu.edu/3e/datalab.pdf

The purpose of this assignment is to become more familiar with bit-level representations of integers and floating point numbers. You’ll do this by solving a series of programming “puzzles.” Many of these puzzles are quite artificial, but you’ll find yourself thinking much more about bits in working your way through them.

本实验的实验环境Unix-like system,需要使用gcc编译32位程序。再ubuntu20版本中,也许需要安装32位、64位同时支持的gcc

2. Puzzles

2.1 bitXor

1
2
3
4
5
6
7
8
9
10
11
/*
* bitXor - x^y using only ~ and &
* Example: bitXor(4, 5) = 1
* Legal ops: ~ &
* Max ops: 14
* Rating: 1
*/
int bitXor(int x, int y) {
int ans = (~(x&y))&(~((~x)&(~y)));
return ans;
}

bitXor的情况比较简单,可以使用真值表,画出卡诺图的方法来化简。

2.2 Tmin

1
2
3
4
5
6
7
8
9
10
/*
* tmin - return minimum two's complement integer
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 4
* Rating: 1
*/
int tmin(void) {
int ans = 0x1 << 31;
return ans;
}

根据定义,Tmin是0x80000000,直接使用移位操作就行。

2.3 isTmax

1
2
3
4
5
6
7
8
9
10
11
/*
* isTmax - returns 1 if x is the maximum, two's complement number,
* and 0 otherwise
* Legal ops: ! ~ & ^ | +
* Max ops: 10
* Rating: 1
*/
int isTmax(int x) {
int ans = (!((x+1)^(~x)))&(!!((~x)^(0x0)));
return ans;
}

这个看起来比较困难,但是考虑一下规律 Tmax+1 = Tmin,~Tmax = Tmin,可以通过这两点来判断一个数是否是Tmax,但是,需要注意的是当x的值是0xFFFFFFFF时,也是符合这个规律的,后面的其他puzzles也有这个特例,需要注意。

注意这里!!的作用,当!!(非零)输出1,反之输出0

2.4 allOddBits

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
/*
* allOddBits - return 1 if all odd-numbered bits in word set to 1
* where bits are numbered from 0 (least significant) to 31 (most significant)
* Examples allOddBits(0xFFFFFFFD) = 0, allOddBits(0xAAAAAAAA) = 1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 12
* Rating: 2
*/
int allOddBits(int x) {
int ans = 0;
int msk = 0xAA + (0xAA << 8);
msk = msk + (msk << 16);
ans = !((msk&x)^msk);
return ans;
}

这里使用odd-Mask来获得在奇数位上的mask,然后把这个mask应用到输入的数中。

2.5 negate

1
2
3
4
5
6
7
8
9
10
11
/*
* negate - return -x
* Example: negate(1) = -1.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 5
* Rating: 2
*/
int negate(int x) {
int ans = ~x + 1;
return ans;
}

最基本的补码操作。

2.6 isAsciiDigit

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
/*
* isAsciiDigit - return 1 if 0x30 <= x <= 0x39 (ASCII codes for characters '0' to '9')
* Example: isAsciiDigit(0x35) = 1.
* isAsciiDigit(0x3a) = 0.
* isAsciiDigit(0x05) = 0.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 15
* Rating: 3
*/
int isAsciiDigit(int x) {
int sign = 0x1 << 31;
int upbound = ~(sign|0x39);
int lowbound = ~0x30+1;
int ans = !((sign&(upbound+x)>>31)|(sign&(lowbound+x)>>31));
return ans;
}

一个不是非常显然的trick,非常的巧妙,判断这个数是否在0x30和0x39之间的条件是,(0x30+1)+x应当大于0,而(Tmin+0x39) =-(Tmin+0x39)-1,这里-1的目的是因为Tamx=2^32-1, Tmin=-2^32,这里用到的性质就是判断-(Tmin+0x39)-1+x是否“溢出”到负数。

2.7 conditional

1
2
3
4
5
6
7
8
9
10
11
12
/*
* conditional - same as x ? y : z
* Example: conditional(2,4,5) = 4
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 16
* Rating: 3
*/
int conditional(int x, int y, int z) {
x = !!x;
x = ~x + 1;
return (x&y)|((~x)&z);
}

双目运算符实现,在 2.3 中已经说明了!!的用途,这里首先判断x是否是0,如果x是0,x的值应当是0x00000000,反之是0x00000001,所以为了下面并操作的方便对x取负数,则x是0的时候,x=0x00000000,x=0x11111111,接下来就直接或操作,输出一个真值就行。

2.8 isLessOrEqual

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
/*
* isLessOrEqual - if x <= y then return 1, else return 0
* Example: isLessOrEqual(4,5) = 1.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 24
* Rating: 3
*/
int isLessOrEqual(int x, int y) {
int negX=~x+1;
int addX=negX+y;
int checkSign = addX>>31&1;
int leftBit = 1<<31;
int xLeft = x&leftBit;
int yLeft = y&leftBit;
int bitXor = xLeft ^ yLeft;
bitXor = (bitXor>>31)&1;
return ((!bitXor)&(!checkSign))|(bitXor&(xLeft>>31));
}

思考:为什么不能直接判断y-x的正负?

当y=0x7FFFFFFF且x=0x80000000时,显然y-x已经超过了32位能表示的范围,所以这样判断时不准确的。如果符号相同,用减法的话时不会出现这个情况的。

分类讨论:当符号相同的时候,看addX的符号位;当符号不同的时候直接输出。

2.9 logicalNeg

1
2
3
4
5
6
7
8
9
10
11
/*
* logicalNeg - implement the ! operator, using all of
* the legal operators except !
* Examples: logicalNeg(3) = 0, logicalNeg(0) = 1
* Legal ops: ~ & ^ | + << >>
* Max ops: 1
* Rating: 4
*/
int logicalNeg(int x) {
return ((x|(~x+1))>>31)+1;
}

非0为1,其余的情况为0。利用补码来实现,显然,只有0和Tmin的补码为本身,其余都是相反数。所以可以利用这个性质,将一个数(除了0和Tmin)与其补码做与操作,做与操作我们需要的结果实际上就是判断这个数和他的补码是不是在符号位上不一样,所以我们>>31用来取到这个数的符号位,如果符号位是0,代表这个数是0,否则是其他的数。

2.10 howManyBits

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
/* howManyBits - return the minimum number of bits required to represent x in
* two's complement
* Examples: howManyBits(12) = 5
* howManyBits(298) = 10
* howManyBits(-5) = 4
* howManyBits(0) = 1
* howManyBits(-1) = 1
* howManyBits(0x80000000) = 32
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 90
* Rating: 4
*/
int howManyBits(int x) {
int b0,b1,b2,b4,b8,b16;
int sign = x >> 31;
x = (sign&~x)|(~sign&x);
b16 = !!(x>>16)<<4;
x = x >> b16;
b8 = !!(x>>8)<<3;
x = x >> b8;
b4 = !!(x>>4)<<2;
x = x >> b4;
b2 = !!(x>>2)<<1;
x = x >> b2;
b1 = !!(x>>1);
x = x >> b1;
b0 = x;
return b16+b8+b4+b2+b1+b0+1;
}

首先判断符号位子,把所有的负数全部都转换成正数。因为在正数中最大一位必定是1,而负数中最大的一位是0,不好判断。而对于Tmin,转换后还是Tmin,所以不会因为转换成正数,超出正数的表示范围而出问题。然后,使用折半的方法来查找。注意运算顺序,在b16 = !!(x>>16)<<4;中,先运算!!(x>>16)再做<<4操作,这里的意思是,先看高16位有没有1,有的话那么低16位肯定都需要用到,则b16=0x00000001<<4,那么再检查高16位中的高8位有没有1……,延续这些步骤直到最后一位。

2.11 floatScale2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
//float
/*
* floatScale2 - Return bit-level equivalent of expression 2*f for
* floating point argument f.
* Both the argument and result are passed as unsigned int's, but
* they are to be interpreted as the bit-level representation of
* single-precision floating point values.
* When argument is NaN, return argument
* Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
* Max ops: 30
* Rating: 4
*/
unsigned floatScale2(unsigned uf) {
int exp = (uf&0x7f800000)>>23;
int sign = uf&(1<<31);
if(exp==0) return uf<<1|sign;
if(exp==255) return uf;
exp++;
if(exp==255) return 0x7f800000|sign;
return (exp<<23)|(uf&0x807fffff);
}

本函数要求使用 unsigned int 来表示float,并计算2float,首先,明确unsigned int 在 IA32 计算机中的位数,一共32位,那么sign占一位,exp占8位,M占23位。那么对于2f来说,只要修改exp就行了。但是存在特殊情况,当exp==0时,是非规格化的情况,直接M<<1就行;当exp==0xFF时,代表NaN,直接返回。再对exp++,就可以。在增加过后,还需要检查exp是否正确,当exp==0xFF,返回原符号的无穷大。否则,返回指数加一以后的数。

2.12 floatFloat2Int

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
/*
* floatFloat2Int - Return bit-level equivalent of expression (int) f
* for floating point argument f.
* Argument is passed as unsigned int, but
* it is to be interpreted as the bit-level representation of a
* single-precision floating point value.
* Anything out of range (including NaN and infinity) should return
* 0x80000000u.
* Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
* Max ops: 30
* Rating: 4
*/
int floatFloat2Int(unsigned uf) {
int s_ = uf>>31;
int exp_ = ((uf&0x7f800000)>>23)-127;
int frac_ = (uf&0x007fffff)|0x00800000;
if(!(uf&0x7fffffff)) return 0;

if(exp_ > 31) return 0x80000000;
if(exp_ < 0) return 0;

if(exp_ > 23) frac_ <<= (exp_-23);
else frac_ >>= (23-exp_);

if(!((frac_>>31)^s_)) return frac_;
else if(frac_>>31) return 0x80000000;
else return ~frac_+1;
}

float to int. float中frac部分是大于等于1的,因为这样可以免费的拿到一个精度。如果真实的指数大于31,那么在转换为int时候,1<<31会覆盖掉符号位,所以exp > 31 就算是int溢出了,我们就直接返回溢出值0x80000000。如果exp < 0 那么就只有小数,直接舍去。接下来就是要把23位的小数部分全部转化为整数,然后判断溢出,再舍去小数部分。如果和原符号相同则直接返回,否则如果结果为负(原来为正)则溢出返回越界指定值0x80000000u,否则原来为负,结果为正(因为这里运算的时候是只计算了frac,而frac在1~2之间,没有考虑原来的符号位),则需要返回其补码(相反数)。

2.13 floatPower2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
/*
* floatPower2 - Return bit-level equivalent of the expression 2.0^x
* (2.0 raised to the power x) for any 32-bit integer x.
*
* The unsigned value that is returned should have the identical bit
* representation as the single-precision floating-point number 2.0^x.
* If the result is too small to be represented as a denorm, return
* 0. If too large, return +INF.
*
* Legal ops: Any integer/unsigned operations incl. ||, &&. Also if, while
* Max ops: 30
* Rating: 4
*/
unsigned floatPower2(int x) {
int INF = 0xff<<23;
int exp = x + 127;
if(exp <= 0) return 0;
if(exp >= 255) return INF;
return exp << 23;
}

我们需要求的是 $2^{x}$,其sign=1,exp=1+127=128,frac=1.0-1=0,所以可以表示为 $2.0^{x}=(1.0\times 2)^{x}=1.0 \times 2^{x}$,所以,我们传进来的x就是可以当作exp(减去128后的)用的。但是任然需要判断exp的范围是否正确。exp<<23是把exp放到正确的float位置上。

3. Conclution

本实验在Ubuntu20.04 VM上运行。

最终结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
1. Running './dlc -z' to identify coding rules violations.

2. Compiling and running './btest -g' to determine correctness score.
gcc -O -Wall -m32 -lm -o btest bits.c btest.c decl.c tests.c
btest.c: In function ‘test_function’:
btest.c:332:23: warning: ‘arg_test_range[1]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
332 | if (arg_test_range[1] < 1)
| ~~~~~~~~~~~~~~^~~

3. Running './dlc -Z' to identify operator count violations.

4. Compiling and running './btest -g -r 2' to determine performance score.
gcc -O -Wall -m32 -lm -o btest bits.c btest.c decl.c tests.c
btest.c: In function ‘test_function’:
btest.c:332:23: warning: ‘arg_test_range[1]’ may be used uninitialized in this function [-Wmaybe-uninitialized]
332 | if (arg_test_range[1] < 1)
| ~~~~~~~~~~~~~~^~~

5. Running './dlc -e' to get operator count of each function.

Correctness Results Perf Results
Points Rating Errors Points Ops Puzzle
1 1 0 2 7 bitXor
1 1 0 2 1 tmin
1 1 0 2 9 isTmax
2 2 0 2 7 allOddBits
2 2 0 2 2 negate
3 3 0 2 13 isAsciiDigit
3 3 0 2 8 conditional
3 3 0 2 17 isLessOrEqual
4 4 0 2 5 logicalNeg
4 4 0 2 36 howManyBits
4 4 0 2 14 floatScale2
4 4 0 2 21 floatFloat2Int
4 4 0 2 5 floatPower2

本实验中有很多的trick挺难想的,我在做的过程中也在网上看了solution,虽然CMU的honor code里规定的挺严格的。这个实验的最终目的是为了从熟知到深知到理解信息的表示和处理,用unsigned int 来表示float确实理解更为深刻了。还需要看书巩固一下。

作者

WangCH

发布于

2022-01-30

更新于

2022-02-11

许可协议

评论