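This post was last edited by kami1217 on 2015-3-1 18:33
I typed this whole post by hand, so there are bound to be mistakes!
The picture below is a transistor symbol laid out in gravel on the grounds of some university. It makes a fitting start to this post.
Everyone knows a CPU is just an extremely complex IC packed with a huge number of transistors (has anyone played the game that is also called Transistor? It's pretty fun). Take my i7-3930K (an antique by now, a few years old): it has 2.27 billion transistors. There are two commonly used kinds of transistor. One is the BJT (bipolar junction transistor), whose base current is nonzero, so it wastes power. The other is the MOSFET (MOS for short), which is better: its gate current is essentially zero, so more than 90% of the transistors on the market today are MOS.
Below is an NPN (a BJT). It's fine for building a speaker amplifier, but you will never find one in a CPU: even if the base current is tiny, multiply it by 100 million. Who was it that said P = VI?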
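MOSFETs generally come in two types, NMOS and PMOS. Pull out the periodic table: the elements in the group just before silicon's are the p-type dopants (one valence electron short, so each one leaves a hole; boron is the usual choice, and of course it would be the B that has the hole), while the elements in the group just after it are the n-type dopants (one spare valence electron to give away; phosphorus is the usual choice, and P does look like a pistol).
Put a little pressure (a potential difference) across the two and N gets excited: it wants to fire its electrons into P's holes, and it gets very unhappy if it can't. We call this phenomenon x, where x = pn-junction or love. Depending on the materials, the result differs. An ordinary silicon diode just drops about 0.7 V, while in other materials the recombination is energetic enough to produce a glowing offspring called a photon, and the diode gets upgraded to an LED (light-emitting diode). Oh, and remember Einstein's photoelectric effect? This is roughly that process run in reverse: there, light knocks electrons out; here, electrons give off light. Also, three researchers from Japan developed blue LEDs (InGaN LEDs) and won the 2014 Nobel Prize in Physics for it, because the invention can cut the energy needed to produce white light by up to 90%. Those guys are really something; they truly changed the world.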
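Back to the CPU.
Put a PMOS and an NMOS together and you get CMOS. So what is CMOS for? It is used as a switch: when N is on, P is off; when P is on, N is off.
CMOS cross-section:
Yes, this is what CPUs use. It is tiny and has many layers, but under a microscope this is what it looks like.
For example, this one:
This is the familiar gate built from two NMOS in series and two PMOS in parallel. Strictly speaking, those four transistors form a NAND gate: the output is 0 only when A and B are both 1. Add an inverter on the output and you get an AND gate, whose output is 1 only when A and B are both 1.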
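Two series NMOS under two parallel PMOS give a NAND stage, and an AND gate is that stage followed by an inverter. Here is a minimal switch-level sketch of that idea. The entity name `cmos_and2` and its port names are made up for illustration and are not part of the design posted further down.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical switch-level model; all names here are illustrative only.
entity cmos_and2 is
  port (
    a, b   : in  std_logic;
    nand_y : out std_logic;  -- output of the four-transistor stage
    and_y  : out std_logic   -- same node after one extra CMOS inverter
  );
end entity cmos_and2;

architecture switch_level of cmos_and2 is
  signal pull_down : boolean;    -- both series NMOS conduct
  signal pull_up   : boolean;    -- at least one parallel PMOS conducts
  signal n         : std_logic;  -- internal NAND node
begin
  pull_down <= (a = '1') and (b = '1');
  pull_up   <= (a = '0') or  (b = '0');

  n <= '0' when pull_down else   -- path to ground through the NMOS pair
       '1' when pull_up   else   -- path to the supply through a PMOS
       'X';                      -- undefined inputs: neither network is known

  nand_y <= n;
  and_y  <= not n;
end architecture switch_level;
```

For clean inputs exactly one of `pull_down` and `pull_up` is true, which is the "N on, P off" property that makes CMOS behave as a switch.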
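OK, so why do we want a CPU?
Because we need it to do lots of simple, repetitive logic work.
Why integrate hundreds of millions of transistors into it?
Because we want better performance: the more transistors, the more logic gates you can build, and the more complex the function you can fit into each clock.
Why does everyone like overclocking?
Because when you can't change the CPU's structure, the only way to speed it up is to raise the frequency (T = 1/f): in the same minute you get to run more clock cycles.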
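To put numbers on T = 1/f, here is a small worked example. It assumes the i7-3930K's stock base clock of 3.2 GHz and a purely hypothetical overclock to 4.0 GHz, and it ignores turbo and everything else that affects real performance.

```latex
\[
T_{3.2} = \frac{1}{3.2\times10^{9}\,\mathrm{Hz}} = 0.3125\ \mathrm{ns},
\qquad
N_{3.2} = 60\ \mathrm{s}\times 3.2\times10^{9}\,\mathrm{Hz} = 1.92\times10^{11}\ \text{cycles per minute}
\]
\[
T_{4.0} = \frac{1}{4.0\times10^{9}\,\mathrm{Hz}} = 0.25\ \mathrm{ns},
\qquad
N_{4.0} = 60\ \mathrm{s}\times 4.0\times10^{9}\,\mathrm{Hz} = 2.4\times10^{11}\ \text{cycles per minute}
\]
\[
\frac{N_{4.0}}{N_{3.2}} = \frac{4.0}{3.2} = 1.25
\]
```

So under these assumptions the same minute holds 25% more clock cycles.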
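Why do TSMC and Intel keep showing off how advanced their process technology is? It's always how many nanometers they have reached, what their yield is, and so on...
What it really comes down to is shrinking the gate length L, so the W/L ratio goes up and performance improves.
Two formulas and a figure:
Linear region:
Saturation region: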
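The two formulas were posted as images. For reference, these are the standard long-channel (square-law) drain-current expressions for an NMOS in these two regions. The symbols (mobility \mu_n, gate-oxide capacitance per unit area C_{ox}, threshold voltage V_{th}) are the usual textbook ones and may not match the notation of the original images.

```latex
\[
I_D = \mu_n C_{ox}\,\frac{W}{L}\left[(V_{GS}-V_{th})\,V_{DS}-\frac{V_{DS}^{2}}{2}\right],
\qquad V_{DS} < V_{GS}-V_{th}\quad\text{(linear region)}
\]
\[
I_D = \frac{1}{2}\,\mu_n C_{ox}\,\frac{W}{L}\,(V_{GS}-V_{th})^{2},
\qquad V_{DS} \ge V_{GS}-V_{th}\quad\text{(saturation region)}
\]
```

In both regions the current scales with W/L, which is why a shorter gate gives more drive current at the same width.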
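So what are the benefits of scaling?
A wafer is only so big, so the smaller the die, the more dies you get per wafer, which effectively lowers the cost. If the die size stays the same, performance effectively goes up because you can integrate more transistors. Another benefit is lower power; the reason is a bit involved, but it also comes back to W/L.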
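One common first-order way to see the power saving is the CMOS dynamic-power estimate below. This is not necessarily the argument the author had in mind. The activity factor \alpha, load capacitance C_L and supply voltage V_{DD} are the usual textbook symbols, not quantities taken from this post.

```latex
\[
P_{\mathrm{dyn}} \approx \alpha\, C_L\, V_{DD}^{2}\, f,
\qquad
C_{\mathrm{gate}} \approx C_{ox}\, W\, L
\]
```

Shrinking W and L reduces the gate capacitance each stage has to charge, and smaller devices have historically run at a lower V_{DD}, which enters squared.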
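Can it keep shrinking forever?
Hard, very hard, which is why multi-core showed up! Even so, the slope of Moore's law has flattened; it feels like it is in its death throes.
Before getting to the main topic, let me show you what a CPU is made of:
Yes, you read that right: sand with a high silicon content. So the main ingredient of those CPUs that cost thousands is this stuff.
After processing and purification it turns into this, which looks a bit like you-know-what.
Enough said, typing is tiring... straight to the pictures:
This is the preliminary design diagram. I didn't overthink it; it is mainly meant to get the ball rolling, and more blocks can be added later.
This CPU is 16-bit and every bus is 16 bits wide. There are 8 storage registers, two temporary registers, one instruction register, one multiplexer, and a main controller, which is really just an FSM (finite state machine). Later I added a counter that counts the clocks each instruction takes; for example, movi takes one clock.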
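The counter itself is not among the listings at the end of the post. As a rough idea of what such a block could look like, here is a minimal sketch driven by the `Clear` output of Control_Unit. The entity name `Cycle_Counter`, the 4-bit width and the synchronous clear are all assumptions, not the author's implementation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical cycle counter; name, width and clear style are assumptions.
entity Cycle_Counter is
  port (
    Clock : in  std_logic;
    Clear : in  std_logic;                     -- synchronous clear, e.g. from Control_Unit
    Count : out std_logic_vector(3 downto 0)   -- clocks since the last clear
  );
end entity Cycle_Counter;

architecture rtl of Cycle_Counter is
  signal cnt : unsigned(3 downto 0) := (others => '0');
begin
  process (Clock)
  begin
    if rising_edge(Clock) then
      if Clear = '1' then
        cnt <= (others => '0');   -- instruction finished: start over
      else
        cnt <= cnt + 1;           -- wraps around after 15
      end if;
    end if;
  end process;

  Count <= std_logic_vector(cnt);
end architecture rtl;
```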
Specification (copied straight from my report)
• Register – This system has three inputs, two 1-bit signals, Clock and Enable, and one 16-bit signal, D. Also, this system has one 16-bit output signal, Q.
• Instruction Register – This system has three inputs, two 1-bit signals, Clock and Enable, and one 16-bit signal, Din. Also, this system has one 9-bit output signal, cmd.
• Multiplexer – This system has eleven input signals, one 4-bit Sel signal and ten 16-bit signals, Reg0, Reg1, Reg2, Reg3, Reg4, Reg5, Reg6, Reg7, Din, and AddSub. This system also has one 16-bit output signal, Bus.
• Adder/Subtracter – This system has three input signals, one 1-bit signal Sign, and two 16-bit signals, Rx and Ry. This system has one 16-bit output signal, Output.
• Control Unit – This system has four input signals, three 1-bit signals, Run, Reset, and Clock, and one 9-bit signal IRin. Also, this system has sixteen output signals, one 4-bit Mux signal, and fifteen 1-bit signals, Reg0, Reg1, Reg2, Reg3, Reg4, Reg5, Reg6, Reg7, RegA, RegG, Done, Clear, IRen, and AddSub.
• CPU – This system has four input signals, three 1-bit signals, Run, Reset, and Clock, and one 16-bit signal, Din. Also, this system has four output signals, one 1-bit signal, Done, and three 16-bit signals, BusO, R0out, and R1out.
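Oh, there are also 8 seven-segment displays (the thing that shows the floor number in an elevator), one per register. Looks like I forgot to write that down.
Simulation tests:
Storage register:
Instruction register:
One thing I forgot to mention: an instruction is 9 bits, IIIXXXYYY, where III is the opcode, XXX is the code for Rx, and YYY is the code for Ry.
There are only four instructions so far: mov, movi, add, and sub. Their opcodes are 001, 010, 011, and 100 respectively. The opcode is three bits, so there can be at most 8 of them; 000 is special, which leaves 7.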
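The IIIXXXYYY layout can be pulled apart with three slices. Here is a minimal sketch; the entity `instr_fields` and its `valid` output are invented for illustration and are not a block in the actual design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper that splits a 9-bit instruction into its fields.
entity instr_fields is
  port (
    cmd    : in  std_logic_vector(8 downto 0);  -- IIIXXXYYY
    opcode : out std_logic_vector(2 downto 0);  -- III
    rx     : out std_logic_vector(2 downto 0);  -- XXX
    ry     : out std_logic_vector(2 downto 0);  -- YYY
    valid  : out std_logic                      -- '1' for the four defined opcodes
  );
end entity instr_fields;

architecture rtl of instr_fields is
begin
  opcode <= cmd(8 downto 6);
  rx     <= cmd(5 downto 3);
  ry     <= cmd(2 downto 0);

  with cmd(8 downto 6) select
    valid <= '1' when "001" | "010" | "011" | "100",  -- mov, movi, add, sub
             '0' when others;                         -- 000 and the unused codes
end architecture rtl;
```

For example, cmd = "011000001" splits into opcode 011, Rx = 000 and Ry = 001, that is, add R0, R1. In the listings at the end of the post these 9 bits travel in the top of the 16-bit input word, Din(15 downto 7).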
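Here is a quiz for you: do you know how many bits the MIPS opcode has? What about x86?
Multiplexer:
Add/subtract selection:
mv/mvi:
add/sub:
Once everything checks out, it is time to put the pieces together. The schematic was drawn in Quartus II 11, with a DE2 board for testing.
Figure 1:
Figure 2:
This is the final test result. Looks OK...
Finally, here is the VHDL code. It may be a bit long, sigh~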
Reg_16.vhd

Library ieee;
Use ieee.std_logic_1164.all;

Entity Reg16 is
  Port(
    Clock  : in  std_logic;
    Enable : in  std_logic := '1';
    D      : in  std_logic_vector(15 downto 0) := "0000010000000000";
    Q      : out std_logic_vector(15 downto 0) := "0000000000000000"
  );
End Reg16;

Architecture struct of Reg16 is
Begin
  -- 16-bit register: Q takes D on a rising clock edge while Enable is high.
  Process (Clock)
  Begin
    if rising_edge(Clock) then
      if Enable = '1' then
        Q <= D;
      end if;
    end if;
  End Process;
End struct;
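The waveform for the register test was posted as an image. For checking the same behaviour in a simulator, here is a minimal self-checking testbench sketch. The name `Reg16_tb` and the test value are made up, and it assumes Reg16 has been compiled into the `work` library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench; assumes Reg16 is already compiled into work.
entity Reg16_tb is
end entity Reg16_tb;

architecture sim of Reg16_tb is
  signal Clock  : std_logic := '0';
  signal Enable : std_logic := '0';
  signal D      : std_logic_vector(15 downto 0) := (others => '0');
  signal Q      : std_logic_vector(15 downto 0);
begin
  dut : entity work.Reg16
    port map (Clock => Clock, Enable => Enable, D => D, Q => Q);

  stimulus : process
  begin
    -- Enable low: a rising edge must leave Q at its initial value.
    D <= x"1234";
    wait for 5 ns; Clock <= '1';
    wait for 5 ns; Clock <= '0';
    assert Q = x"0000" report "Q changed while Enable = '0'" severity error;

    -- Enable high: the next rising edge loads D into Q.
    Enable <= '1';
    wait for 5 ns; Clock <= '1';
    wait for 5 ns; Clock <= '0';
    assert Q = x"1234" report "Q did not load D" severity error;

    wait;  -- stop the stimulus
  end process stimulus;
end architecture sim;
```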
IReg.vhd

Library ieee;
Use ieee.std_logic_1164.all;

Entity IReg is
  Port(
    Clock  : in  std_logic;
    Enable : in  std_logic := '1';
    Din    : in  std_logic_vector(15 downto 0) := "0000000000000000";
    cmd    : out std_logic_vector(8 downto 0)  := "000000000"
  );
End IReg;

Architecture struct of IReg is
Begin
  -- Instruction register: latches the top 9 bits of Din (IIIXXXYYY).
  Process (Clock)
  Begin
    if rising_edge(Clock) then
      if Enable = '1' then
        cmd <= Din(15 downto 7);
      end if;
    end if;
  End Process;
End struct;
Multiplexer.vhd

Library ieee;
Use ieee.std_logic_1164.all;

Entity Multiplexer is
  Port(
    Sel    : in  std_logic_vector(3 downto 0);
    Reg0   : in  std_logic_vector(15 downto 0);
    Reg1   : in  std_logic_vector(15 downto 0);
    Reg2   : in  std_logic_vector(15 downto 0);
    Reg3   : in  std_logic_vector(15 downto 0);
    Reg4   : in  std_logic_vector(15 downto 0);
    Reg5   : in  std_logic_vector(15 downto 0);
    Reg6   : in  std_logic_vector(15 downto 0);
    Reg7   : in  std_logic_vector(15 downto 0);
    Din    : in  std_logic_vector(15 downto 0);
    AddSub : in  std_logic_vector(15 downto 0);
    BusO   : out std_logic_vector(15 downto 0)
  );
End Multiplexer;

Architecture struct of Multiplexer is
Begin
  -- Sel 0000..0111 pick Reg0..Reg7, 1000 picks Din, 1001 picks the adder result.
  Process (Sel, Reg0, Reg1, Reg2, Reg3, Reg4, Reg5, Reg6, Reg7, Din, AddSub)
  Begin
    if(Sel = "0000") then
      BusO <= Reg0;
    elsif(Sel = "0001") then
      BusO <= Reg1;
    elsif(Sel = "0010") then
      BusO <= Reg2;
    elsif(Sel = "0011") then
      BusO <= Reg3;
    elsif(Sel = "0100") then
      BusO <= Reg4;
    elsif(Sel = "0101") then
      BusO <= Reg5;
    elsif(Sel = "0110") then
      BusO <= Reg6;
    elsif(Sel = "0111") then
      BusO <= Reg7;
    elsif(Sel = "1000") then
      BusO <= Din;
    elsif(Sel = "1001") then
      BusO <= AddSub;
    else
      BusO <= (others => '0');  -- unused Sel values: drive zeros so no latch is inferred
    end if;
  End Process;
End struct;
Adder.vhd

Library ieee;
Use ieee.std_logic_1164.all;
Use ieee.numeric_std.all;

Entity Adder is
  Port(
    sign   : in  std_logic := '0';
    Rx     : in  std_logic_vector(15 downto 0) := "0000000000000000";
    Ry     : in  std_logic_vector(15 downto 0) := "0000000000000000";
    Output : out std_logic_vector(15 downto 0) := "0000000000000000"
  );
End Adder;

Architecture struct of Adder is
Begin
  -- sign = '0' adds, anything else subtracts; results wrap modulo 2**16.
  Process (sign, Rx, Ry)
  Begin
    if(sign = '0') then
      Output <= std_logic_vector(unsigned(Rx) + unsigned(Ry));
    else
      Output <= std_logic_vector(unsigned(Rx) - unsigned(Ry));
    end if;
  End Process;
End struct;
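Because the adder works on `unsigned` values, results wrap around modulo 2^16 rather than saturating or raising a flag. Here is a small self-checking testbench sketch that exercises this. The name `Adder_tb` and the test values are made up, and it assumes Adder has been compiled into the `work` library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench; assumes Adder is already compiled into work.
entity Adder_tb is
end entity Adder_tb;

architecture sim of Adder_tb is
  signal sign   : std_logic := '0';
  signal Rx, Ry : std_logic_vector(15 downto 0) := (others => '0');
  signal Output : std_logic_vector(15 downto 0);
begin
  dut : entity work.Adder
    port map (sign => sign, Rx => Rx, Ry => Ry, Output => Output);

  stimulus : process
  begin
    Rx <= x"0005"; Ry <= x"0003"; sign <= '0';
    wait for 10 ns;
    assert Output = x"0008" report "5 + 3 failed" severity error;

    sign <= '1';
    wait for 10 ns;
    assert Output = x"0002" report "5 - 3 failed" severity error;

    Rx <= x"0003"; Ry <= x"0005";          -- 3 - 5 wraps around
    wait for 10 ns;
    assert Output = x"FFFE" report "3 - 5 should wrap to FFFE" severity error;

    Rx <= x"FFFF"; Ry <= x"0001"; sign <= '0';
    wait for 10 ns;
    assert Output = x"0000" report "FFFF + 1 should wrap to 0000" severity error;

    wait;  -- stop the stimulus
  end process stimulus;
end architecture sim;
```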
Control_Unit.vhd (this is probably the most important one)

Library ieee;
Use ieee.std_logic_1164.all;
Use ieee.numeric_std.all;

Entity Control_Unit is
  Port (
    Run    : in  std_logic := '1';
    Reset  : in  std_logic := '0';
    Clock  : in  std_logic;
    IRin   : in  std_logic_vector(8 downto 0) := "000000000";
    Done   : out std_logic := '0';
    Clear  : out std_logic := '0';
    IRen   : out std_logic := '1';
    Mux    : out std_logic_vector(3 downto 0) := "0000";
    Reg0   : out std_logic := '0';
    Reg1   : out std_logic := '0';
    Reg2   : out std_logic := '0';
    Reg3   : out std_logic := '0';
    Reg4   : out std_logic := '0';
    Reg5   : out std_logic := '0';
    Reg6   : out std_logic := '0';
    Reg7   : out std_logic := '0';
    RegA   : out std_logic := '0';
    RegG   : out std_logic := '0';
    AddSub : out std_logic := '0'
  );
End Control_Unit;

Architecture struct of Control_Unit is
  Type state_type is (decode, mv, add1, add2);
  Signal currS : state_type;
  Signal nextS : state_type;
  Signal toog  : std_logic := '0';
Begin
  -- State register. toog flips on every rising edge so that the output
  -- process below is re-evaluated once per clock, after currS has updated.
  Process (Clock, Reset)
  Begin
    if(Reset = '1') then
      currS <= decode;
    elsif(rising_edge(Clock)) then
      currS <= nextS;
      toog  <= not toog;
    end if;
  End Process;

  -- mv has one next state, so it takes two clock cycles;
  -- mvi is described as taking one clock cycle, although as coded below
  -- it also passes through the mv state;
  -- add or sub has two next states, so it takes three clock cycles.
  Process (toog)
  Begin
    if(Clock = '1') then
      Case currS is
        when decode =>
          Reg0 <= '0'; Reg1 <= '0'; Reg2 <= '0'; Reg3 <= '0';
          Reg4 <= '0'; Reg5 <= '0'; Reg6 <= '0'; Reg7 <= '0';
          if(IRin(8 downto 6) = "001") then  -- mv Rx, Ry
            Mux <= "0" & IRin(2 downto 0);   -- Ry
            Case IRin(5 downto 3) is
              when "000" => Reg0 <= '1';
              when "001" => Reg1 <= '1';
              when "010" => Reg2 <= '1';
              when "011" => Reg3 <= '1';
              when "100" => Reg4 <= '1';
              when "101" => Reg5 <= '1';
              when "110" => Reg6 <= '1';
              when "111" => Reg7 <= '1';
              when others => null;
            End Case;
            nextS <= mv;
            Done  <= '0';
            Clear <= '0';  -- Keep counting
            IRen  <= '0';  -- Hold the current command
          elsif(IRin(8 downto 6) = "010") then  -- mvi Rx, #D
            Mux <= "1000";  -- Din
            Case IRin(5 downto 3) is
              when "000" => Reg0 <= '1';
              when "001" => Reg1 <= '1';
              when "010" => Reg2 <= '1';
              when "011" => Reg3 <= '1';
              when "100" => Reg4 <= '1';
              when "101" => Reg5 <= '1';
              when "110" => Reg6 <= '1';
              when "111" => Reg7 <= '1';
              when others => null;
            End Case;
            nextS <= mv;
            Done  <= '0';
            Clear <= '0';  -- Keep counting
            IRen  <= '0';  -- Hold the current command
          elsif(IRin(8 downto 6) = "011" or IRin(8 downto 6) = "100") then  -- add or sub Rx, Ry
            Mux   <= "0" & IRin(5 downto 3);  -- Rx
            RegA  <= '1';  -- Store Rx in A on next cycle
            RegG  <= '1';
            nextS <= add1;
            Done  <= '0';
            Clear <= '0';  -- Keep counting
            IRen  <= '0';  -- Hold the current command
          end if;

        when mv =>
          nextS <= decode;
          Reg0 <= '0'; Reg1 <= '0'; Reg2 <= '0'; Reg3 <= '0';
          Reg4 <= '0'; Reg5 <= '0'; Reg6 <= '0'; Reg7 <= '0';
          IRen  <= '1';  -- Get new command next clock cycle
          Done  <= '1';
          Clear <= '1';  -- Clear the counter

        when add1 =>
          RegA   <= '0';                      -- Disable A from writing
          Mux    <= "0" & IRin(2 downto 0);   -- Put Ry on the bus next cycle
          AddSub <= IRin(8);                  -- Determine add ('0') or sub ('1')
          nextS  <= add2;

        when add2 =>
          Case IRin(5 downto 3) is
            when "000" => Reg0 <= '1';
            when "001" => Reg1 <= '1';
            when "010" => Reg2 <= '1';
            when "011" => Reg3 <= '1';
            when "100" => Reg4 <= '1';
            when "101" => Reg5 <= '1';
            when "110" => Reg6 <= '1';
            when "111" => Reg7 <= '1';
            when others => null;
          End Case;
          RegG  <= '0';
          Mux   <= "1001";   -- Put the result from the adder on the bus
          nextS <= decode;   -- Reset all reg
          Clear <= '1';      -- Clear the counter
          Done  <= '1';
          IRen  <= '1';

        when others => null;
      End Case;
    end if;
  End Process;
End struct;
dec_7seg.vhd

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.all;

-- Hexadecimal to 7 Segment Decoder for LED Display
ENTITY dec_7seg IS
  PORT(
    hex_digit_16 : IN STD_LOGIC_VECTOR(15 DOWNTO 0);
    segment_a, segment_b, segment_c, segment_d,
    segment_e, segment_f, segment_g : OUT std_logic
  );
END dec_7seg;

ARCHITECTURE a OF dec_7seg IS
  SIGNAL segment_data : STD_LOGIC_VECTOR(6 DOWNTO 0);  -- bit 6 = a ... bit 0 = g
  SIGNAL hex_digit_4  : STD_LOGIC_VECTOR(3 DOWNTO 0);
BEGIN
  PROCESS (Hex_digit_16, Hex_digit_4)
  -- HEX to 7 Segment Decoder for LED Display
  BEGIN
    -- Hex_digit_4 is the four-bit binary value to display in hexadecimal;
    -- only the lowest nibble of the 16-bit input is decoded.
    Hex_digit_4 <= Hex_digit_16(3 DOWNTO 0);
    CASE Hex_digit_4 IS
      WHEN x"0" => segment_data <= "1111110";
      WHEN x"1" => segment_data <= "0110000";
      WHEN x"2" => segment_data <= "1101101";
      WHEN x"3" => segment_data <= "1111001";
      WHEN x"4" => segment_data <= "0110011";
      WHEN x"5" => segment_data <= "1011011";
      WHEN x"6" => segment_data <= "1011111";
      WHEN x"7" => segment_data <= "1110000";
      WHEN x"8" => segment_data <= "1111111";
      WHEN x"9" => segment_data <= "1111011";
      WHEN x"A" => segment_data <= "1110111";
      WHEN x"B" => segment_data <= "0011111";
      WHEN x"C" => segment_data <= "1001110";
      WHEN x"D" => segment_data <= "0111101";
      WHEN x"E" => segment_data <= "1001111";
      WHEN x"F" => segment_data <= "1000111";
      WHEN OTHERS => segment_data <= "0111110";  -- if you get something else, this is going to show "U"
    END CASE;
  END PROCESS;

  -- extract segment data bits and invert
  -- LED driver circuit is inverted
  segment_a <= NOT segment_data(6);
  segment_b <= NOT segment_data(5);
  segment_c <= NOT segment_data(4);
  segment_d <= NOT segment_data(3);
  segment_e <= NOT segment_data(2);
  segment_f <= NOT segment_data(1);
  segment_g <= NOT segment_data(0);
END a;
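As posted, dec_7seg takes a 16-bit input but decodes only bits 3 down to 0, so a single decoder shows just the lowest hex digit of a register. One possible way to show all four digits is to instantiate it four times and hand each copy a different nibble. The wrapper below is only a sketch: `hex4_display` and its ports are invented names, and it assumes dec_7seg has been compiled into the `work` library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper; not part of the original design.
entity hex4_display is
  port (
    value : in  std_logic_vector(15 downto 0);
    -- one 7-bit vector per digit, bit 6 = segment a ... bit 0 = segment g;
    -- seg0 is the least significant hex digit
    seg0, seg1, seg2, seg3 : out std_logic_vector(6 downto 0)
  );
end entity hex4_display;

architecture structural of hex4_display is
  type word_array is array (0 to 3) of std_logic_vector(15 downto 0);
  type seg_array  is array (0 to 3) of std_logic_vector(6 downto 0);
  signal digit : word_array;
  signal seg   : seg_array;
begin
  gen_digits : for i in 0 to 3 generate
    -- Move nibble i into bits 3..0, which is all dec_7seg looks at.
    digit(i) <= "000000000000" & value(4*i + 3 downto 4*i);

    dec : entity work.dec_7seg
      port map (
        hex_digit_16 => digit(i),
        segment_a    => seg(i)(6),
        segment_b    => seg(i)(5),
        segment_c    => seg(i)(4),
        segment_d    => seg(i)(3),
        segment_e    => seg(i)(2),
        segment_f    => seg(i)(1),
        segment_g    => seg(i)(0)
      );
  end generate gen_digits;

  seg0 <= seg(0);
  seg1 <= seg(1);
  seg2 <= seg(2);
  seg3 <= seg(3);
end architecture structural;
```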
Last one, the overall test:
Test_CPU.vhd

Library ieee;
Use ieee.std_logic_1164.all;

Entity Test_CPU is
End Test_CPU;

Architecture struct of Test_CPU is
  Component CPU is
    Port(
      Run   : in  std_logic := '1';
      Reset : in  std_logic := '1';
      Clock : in  std_logic;
      Din   : in  std_logic_vector(15 downto 0);
      BusO  : out std_logic_vector(15 downto 0);
      Done  : out std_logic;
      R0out : out std_logic_vector(15 downto 0);
      R1out : out std_logic_vector(15 downto 0)
    );
  End Component;

  Signal Clk  : std_logic := '0';
  Signal Din  : std_logic_vector(15 downto 0);
  Signal MSB  : std_logic_vector(15 downto 0);
  Signal Done : std_logic;
  Signal r0o  : std_logic_vector(15 downto 0);
  Signal r1o  : std_logic_vector(15 downto 0);
Begin
  Sys: CPU Port Map(
    Run   => '1',
    Reset => '1',  -- tied high as posted; the top-level CPU schematic is not shown
    Clock => Clk,
    Din   => Din,
    BusO  => MSB,
    Done  => Done,
    R0out => r0o,
    R1out => r1o
  );

  Process
  Begin
    wait for 50 ns; Din <= "0100000000000000";  -- movi R0, 128
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low

    wait for 50 ns; Din <= "0000000010000000";  -- 128
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low

    wait for 50 ns; Din <= "0010010000000000";  -- mov R1, R0
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low

    wait for 50 ns; Din <= "0110000010000000";  -- add R0, R1
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low

    wait for 50 ns; Din <= "0110000010000000";  -- add R0, R1
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low

    wait for 50 ns; Din <= "1000000010000000";  -- sub R0, R1
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low
    wait for 5 ns;  Clk <= not Clk;  -- High
    wait for 5 ns;  Clk <= not Clk;  -- Low

    wait;  -- end of stimulus
  End Process;
End struct;
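The hand-written clock toggles in the testbench are easy to miscount. One way to make the per-instruction clock counts explicit is a small procedure that emits n full cycles. The sketch below shows only the stimulus side: `Pulse_Demo` and `pulse` are invented names, the CPU instance is left out, and the Din words and timings are copied from the testbench above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stimulus-only sketch; the CPU instance is omitted.
entity Pulse_Demo is
end entity Pulse_Demo;

architecture sim of Pulse_Demo is
  signal Clk : std_logic := '0';
  signal Din : std_logic_vector(15 downto 0) := (others => '0');

  -- Drive n full clock cycles: 5 ns, rise, 5 ns, fall.
  procedure pulse (signal c : out std_logic; constant n : in positive) is
  begin
    for i in 1 to n loop
      wait for 5 ns; c <= '1';
      wait for 5 ns; c <= '0';
    end loop;
  end procedure pulse;
begin
  stimulus : process
  begin
    wait for 50 ns; Din <= "0100000000000000"; pulse(Clk, 1);  -- movi R0, #D
    wait for 50 ns; Din <= "0000000010000000"; pulse(Clk, 1);  -- immediate value 128
    wait for 50 ns; Din <= "0010010000000000"; pulse(Clk, 2);  -- mov R1, R0
    wait for 50 ns; Din <= "0110000010000000"; pulse(Clk, 3);  -- add R0, R1
    wait for 50 ns; Din <= "0110000010000000"; pulse(Clk, 3);  -- add R0, R1
    wait for 50 ns; Din <= "1000000010000000"; pulse(Clk, 3);  -- sub R0, R1
    wait;  -- stop the stimulus
  end process stimulus;
end architecture sim;
```

Written this way, the second argument of each `pulse` call is the number of clocks that instruction is given.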
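Since I brought up those gentlemen from Japan, I should add the picture: here are the three of them, seriously impressive.
And the blue LED that changed the world.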