Network on a chip


Network on chip or network on a chip (NoC or NOC) is a communication subsystem on an integrated circuit (commonly called a "chip"), typically between intellectual property (IP) cores in a system on a chip (SoC). NoCs can span synchronous and asynchronous clock domains or use unclocked asynchronous logic. NoC technology applies networking theory and methods to on-chip communication and brings notable improvements over conventional bus and crossbar interconnections. Compared with other designs, a NoC improves the scalability of SoCs and the power efficiency of complex SoCs.

Parallelism and scalability

The wires in the links of the NoC are shared by many signals. A high level of parallelism is achieved, because all links in the NoC can operate simultaneously on different data packets. Therefore, as the complexity of integrated systems keeps growing, a NoC provides enhanced performance (such as throughput) and scalability in comparison with previous communication architectures (e.g., dedicated point-to-point signal wires, shared buses, or segmented buses with bridges). Of course, the algorithms must be designed in such a way that they offer large parallelism and can hence utilize the potential of NoC.
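As a rough illustration of the parallelism argument above, the following Python sketch counts the links of a 2D mesh and compares the best-case number of simultaneous transfers with that of a single shared bus. The mesh topology, the full-duplex link assumption, and the function names are assumptions made for this example only; real throughput depends on traffic and routing.

```python
def mesh_link_count(rows: int, cols: int) -> int:
    """Number of bidirectional router-to-router links in a rows x cols 2D mesh."""
    if rows < 1 or cols < 1:
        raise ValueError("mesh dimensions must be positive")
    horizontal = rows * (cols - 1)  # links between neighbours in the same row
    vertical = cols * (rows - 1)    # links between neighbours in the same column
    return horizontal + vertical


def max_concurrent_transfers(rows: int, cols: int) -> dict:
    """Best-case upper bound on simultaneous transfers.

    Assumes a shared bus carries one transfer at a time and every mesh link
    is full duplex (one transfer per direction).
    """
    return {"shared_bus": 1, "mesh": 2 * mesh_link_count(rows, cols)}


if __name__ == "__main__":
    print(max_concurrent_transfers(4, 4))
```

Under these assumptions the bound grows with the size of the mesh, while the shared bus stays at one transfer regardless of the number of cores.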

Benefits of adopting NoCs

Traditionally, ICs have been designed with dedicated point-to-point connections, with one wire dedicated to each signal. For large designs, in particular, this has several limitations from a physical design viewpoint. The wires occupy much of the area of the chip, and in nanometer CMOS technology, interconnects dominate both performance and dynamic power dissipation, as signal propagation in wires across the chip requires multiple clock cycles. (See Rent's rule for a discussion of wiring requirements for point-to-point connections).
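For reference, Rent's rule is usually written in the following form. The symbol names are the conventional ones and are not taken from this article: T is the number of external signal terminals of a logic block, g the number of gates (or sub-blocks) it contains, t the average number of terminals per gate, and p the Rent exponent, typically between 0 and 1.

```latex
T = t \, g^{p}
```

A larger exponent p means that the number of connections leaving a block grows faster with block size, which is what makes dedicated point-to-point wiring expensive in large designs.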

Research on on-chip networks

Although NoCs can borrow concepts and techniques from the well-established domain of computer networking, it is impractical to blindly reuse features of "classical" computer networks and symmetric multiprocessors[citation needed]. In particular, NoC switches should be small, energy-efficient, and fast[citation needed]. Neglecting these aspects along with proper, quantitative comparison was typical for early NoC research but nowadays they are considered in more detail[citation needed]. The routing algorithms should be implemented by simple logic, and the number of data buffers should be minimal[citation needed]. Network topology and execution properties may be application-specific on MPSoCs[citation needed].
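To illustrate what "simple logic" can mean for routing, here is a minimal Python sketch of dimension-order (XY) routing on a 2D mesh, a widely used deterministic scheme. It is offered only as an example and is not claimed to be the algorithm of any particular NoC mentioned in this article; the coordinate convention and port names are assumptions.

```python
from typing import List, Tuple

LOCAL, EAST, WEST, NORTH, SOUTH = "local", "east", "west", "north", "south"


def xy_route(current: Tuple[int, int], dest: Tuple[int, int]) -> str:
    """Pick the output port at `current` for a packet heading to `dest`.

    The packet first travels along X until the column matches, then along Y.
    Only two comparisons per hop are needed and no routing table is kept.
    """
    cx, cy = current
    dx, dy = dest
    if cx < dx:
        return EAST
    if cx > dx:
        return WEST
    if cy < dy:
        return NORTH
    if cy > dy:
        return SOUTH
    return LOCAL  # arrived: deliver to the attached core


def path(src: Tuple[int, int], dest: Tuple[int, int]) -> List[str]:
    """List the ports taken hop by hop, ending with the local delivery port."""
    step = {EAST: (1, 0), WEST: (-1, 0), NORTH: (0, 1), SOUTH: (0, -1)}
    hops: List[str] = []
    pos = src
    while True:
        port = xy_route(pos, dest)
        hops.append(port)
        if port == LOCAL:
            return hops
        pos = (pos[0] + step[port][0], pos[1] + step[port][1])


if __name__ == "__main__":
    print(path((0, 0), (2, 1)))
```

Because the decision depends only on the current and destination coordinates, the logic maps to a few comparators in hardware, which is the kind of small, fast switch logic the paragraph above calls for.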

Some researchers[who?] think that NoCs need to support quality of service (QoS), that is, to meet various requirements in terms of throughput, end-to-end delay, fairness,[1] and deadlines[citation needed]. Real-time computation, including audio and video playback, is one reason for providing QoS support. However, current system implementations like VxWorks, RTLinux or QNX are able to achieve sub-millisecond real-time computing without special hardware[citation needed]. This may indicate that for many real-time applications the service quality of existing on-chip interconnect infrastructure is sufficient, and that dedicated hardware logic would be necessary only to achieve microsecond precision, a degree that is rarely needed in practice for end users (sound or video jitter needs only a latency guarantee of tens of milliseconds). Another motivation for NoC-level quality of service is to support multiple concurrent users sharing the resources of a single chip multiprocessor in a public cloud computing infrastructure. In such instances, hardware QoS logic enables the service provider to make contractual guarantees on the level of service that a user receives, a feature that may be deemed desirable by some corporate or government clients[citation needed].

To date,[when?] several prototype NoCs have been designed and analyzed in academia, but only a few have been implemented on silicon[citation needed]. Many challenging research problems remain to be solved at all levels, from the physical link level through the network level, and all the way up to the system architecture and application software. The first dedicated research symposium on networks on chip was held at Princeton University in May 2007.[2] The second IEEE International Symposium on Networks-on-Chip was held in April 2008 at Newcastle University.

Research has been done on integrated optical waveguides and devices comprising an optical network on a chip (ONoC).[3][4]

NoC benchmarks

NoC development and studies require comparing different proposals and options, and NoC traffic patterns are developed to help with such evaluations. Existing NoC benchmarks include NoCBench and MCSL NoC Traffic Patterns.[5]

Commercial providers of NoC solutions

See also

Chapter 1 INTRODUCTION

1.1 Overview

A billion transistors, one million gates, thousands of circuits, hundreds of designs on a single IC chip: such intricate designs pose enormous challenges to IC designers. The most successful IC designers overcome these challenges to provide functionally correct and consistent operation of the ICs. As integration increases, cost efficiency also becomes a major concern in IC design. Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate. This was proposed by Gordon E. Moore in his 1965 paper "Cramming more components onto integrated circuits", the paper in which he stated the well-known Moore's law. By 2025 the physical dimensions of transistors are expected to cross the 10 nm threshold, according to a 2012 report of the International Technology Roadmap for Semiconductors. Figure 1.1 shows the growth of transistor integration on a single chip over the past two decades.

Figure 1.1 Growth of transistor integration on a chip

To keep pace with increasing levels of integration, design engineers have come up with a new design methodology called SoC. SoC is a technology in which the maximum functionality is packed into the smallest possible space. The design of an SoC is strongly influenced by the so-called intellectual property (IP) core. An IP core is a predesigned, preverified silicon circuit building block. A core typically contains at least 5,000 gates and can be used in building a larger or more complex application on a semiconductor chip. IP cores are the building blocks of system-on-chip designs for implementing larger and more complex embedded system applications.

An SoC can hold hardware such as processors, memories and various custom logic blocks, together with the software that controls the hardware. The main advantages of an SoC are low power consumption, lower cost and higher reliability than the multi-chip systems it has replaced. But the transition to system-on-chip technology faced many challenges. First, scalability of the system: it is an enormous task to scale large computer systems down to the size of a silicon die, and the physical dimensions of the various components and their inductive and capacitive effects on other components need to be taken care of. Second, it is difficult to maintain global synchronization, as different systems use different clock signals. Third, heterogeneity of the whole package: dissimilar systems with dissimilar library files, packages, coding languages and dimensions need to be packaged together. Fourth, interconnection of the systems: cores alone do not make up a system on chip. Figure 1.2 shows the communication systems used in SoCs.

Figure 1.2 Communication systems in SoC: a) traditional bus-based communication, b) point-to-point links, c) NoC

Typically, the interconnection architecture is based on dedicated wires or shared buses. Dedicated wires have poor reusability and flexibility. A shared bus is a set of wires common to multiple cores. The shared-bus approach is more flexible and is completely reusable, but it allows only one communication operation at a time, all cores share the same communication bandwidth in the system, and its scalability is limited to a few dozen IP cores. Thus scalability is a major problem with buses. The network-on-chip architecture has been proposed as a high-performance, scalable and power-efficient alternative to the bus-based architecture.

It solves the scalability problem by supporting multiple concurrent connections between systems. As a system becomes more complex, more and more components can be integrated into the existing system with ease and without constraints. It can greatly reduce wire routing congestion. Systems interconnected by a network on chip can easily be interchanged with other systems using IP cores from any vendor on the market. The NoC separates communication from computation for system simplicity and is ideally suited to integrated systems. The NoC can handle the communication part with ease, without interfering with the computation part.

Nowadays, many ICs contain a number of memories, processor cores, hardware cores and analog components integrated on the same chip. Such systems on chip are widely used in high-end and high-volume applications, ranging from multimedia and wireless and wired communication systems to defence and aerospace applications. As the number of cores fabricated on an SoC increases with technology scaling, 2D chip fabrication technology faces many challenges in exploiting the exponentially growing number of transistors. With smaller feature sizes, the performance of transistors has increased dramatically. Another major impact of increased lengths and RC values is that the power consumption of global interconnects becomes significant, posing a large challenge for system designers.

A 3D IC is a stack of multiple ultra-thin IC layers that are vertically bonded and interconnected by through-silicon vias (TSVs), as shown in Figure 1.3. In a 3D implementation, each block can be fabricated and optimized using its own technology and then assembled to form a vertical stack.
3D stacking of ultra-thin ICs is recognized as a foreseeable solution for future performance improvement, system compactness, and functional diversification.

Figure 1.3 Representation of 3D IC

1.2 Objective

The objective is to design an efficient and reliable 10-port router for a 3D network on chip (NoC); the router is an important part of the entire NoC. The 10-port router is designed using a crossbar switch and an arbiter. The main aim of this project is to reduce delay.

1.3 Motivation

Network on chip (NoC) is an active research field. Many aspects of NoC still require more exploration and understanding. This project concentrates on designing an efficient router for NoC applications. The router is the main component that determines the latency, throughput, reliability and efficiency of the entire NoC design.

1.4 Problem Statement

To provide communication between multiple processors, a network-on-chip router with greater addressing capability has to be designed. The literature survey shows that routing is limited in previous work, in which 5-port and 8-port routers were designed. An 8-port router can connect a network of only 8 systems, which is limiting. The previous routers used buffer-oriented designs, which consume more power. In the proposed "Design and Verification of 10 port router for 3D-NoC architecture", buffers are reduced by using a crossbar switch and an arbiter.

1.5 Solution to the problem

As the density of a VLSI system increases, its complexity also increases. Increased complexity brings many challenges, one of which is on-chip interconnection. Overcoming the on-chip interconnection problem requires an efficient router. This project designs a 10-port router, which is an advance on the previous 8-port router. The proposed 10-port router is designed using a crossbar switch and an arbiter, so it uses fewer buffers and consumes less power.

1.6 Organization of the report

The project report is documented in 9 chapters. A brief overview of the chapters follows:

Chapter 1: "Introduction" gives an overview of network on chip (NoC) and states the objective, the motivation of this project, the problem statement, and the solution to the problem.
Chapter 2: "Literature survey" covers previous work and the papers and journals referred to during the course of the project.
Chapter 3: "Network on Chip (NoC)" describes the 2D network on chip, the 3D network on chip and the router.
Chapter 4: "Proposed 10 Port Router Architecture" deals with the proposed system design using a crossbar switch and an arbiter.
Chapter 5: "Simulation Results" presents the simulation carried out in ModelSim 6.3f, along with the RTL schematic, the technology schematic and the top module of the 10-port router.
Chapter 6: "Synthesis, Power Analysis and Coverage Results" discusses the synthesis, power analysis and coverage results.
Chapter 7: "Advantages" lists the advantages of the network-on-chip (NoC) router.
Chapter 8: "Conclusion & Future work" gives the conclusion of this project and outlines future work.
References: the list of references used for the completion of the project, including various IEEE papers.

Chapter 2 LITERATURE SURVEY

A good number of papers were referred to during the course of the project work. Some of them are listed below.

Khalid Latif et al. state that the NoC is the interconnection platform that answers the needs of modern on-chip design. They note that as the number of buffers increases, power consumption and area also increase.
Latif et al. optimize the router architecture by utilizing idle buffers, rather than increasing the number and size of buffers, for better efficiency [1].

Brett Stanley Feero et al. state that the NoC is emerging as a revolutionary approach to integrating a large number of IP blocks in a single die. They present the performance of 3D NoCs and demonstrate their good functionality in terms of efficiency and energy [2].

Aamir Zia et al. state that placing a large number of processing components to form multi-core chip processors requires an easily scalable, high-performance architecture. Their work proposes a 3D CLOS NOC to accomplish these goals [3].

Terrence Mak et al. state that dynamic routing is promising because of its improvement in communication bandwidth. They present a deadlock-free routing scheme that employs a dynamic-programming (DP) network to provide on-the-fly optimal path planning and network monitoring for packet switching [4].

Kun-Chih Chen et al. state that the 3D NoC has been proposed to solve the complex on-chip communication issues of future 3D multicore systems. However, the thermal problems of a 3D NoC are more serious than those of a 2D NoC owing to chip stacking [5].

En-Jui Chang et al. state that network-on-chip systems can achieve high performance compared with bus systems for chip multiprocessor systems. As the complexity of the network increases, switch and channel congestion become major performance bottlenecks. They use a conventional adaptive routing scheme to detect the congestion status and remodel the path congestion information to reveal hidden spatial congestion information [6].

Chapter 3 NETWORK ON CHIP (NoC)

NoC is a technology intended to solve the shortcomings of buses. It is a method of designing the communication subsystem between IP cores in an SoC design.
NoC addresses the shortcomings of buses by implementing a communication network of switches (micro-routers) and resources. The most popular 2D NoC topology is the 2D mesh, shown in Figure 3.1(a); this architecture consists of an m × n mesh of switches interconnecting IP blocks. Figure 3.1(b) shows an example of a 3D mesh NoC. It consists of 7-port switches: one port is connected to the IP, one each to the switches above and below, and one to each cardinal direction.

Figure 3.1(a) 2D Mesh NoC architecture
Figure 3.1(b) 3D Mesh NoC architecture

3.1 Router

A router is the most significant component in a NoC, so it should be designed for maximum efficiency and throughput. A router directs traffic from source to destination in a network. A typical routing node is shown in Figure 3.2.

Figure 3.2 A typical routing node (a router with routing logic, attached to a processing element (PE) through PE input and output ports, with input and output ports in the north, south, east and west directions)

The architecture of a router consists of input ports, output ports and a switching medium. Routers receive incoming data packets, examine their destination and work out the best path for the data to move from source to destination. A router's architecture determines its critical path delay, which affects per-hop delay and network latency. Consequently, the router should be designed to meet the required latency and throughput under tight area and power constraints. The design efficiency of the router determines the performance of the network.

Chapter 4 PROPOSED 10 PORT ROUTER ARCHITECTURE

The proposed 10-port router architecture is shown in Figure 4.1. The router architecture consists of FIFO buffers, a crossbar switch and an arbiter. This architecture has fewer FIFO buffers, so it takes very little area.

Figure 4.1 Proposed 10 port Router architecture (ports 1 to 10, each with a FIFO and control logic, connected through a crossbar switch that is controlled by an arbiter)

4.1 FIFO Buffer

In this architecture, packets in transit are stored in a buffer. Each input channel has a FIFO buffer and control logic. The FIFO controller receives packets from the output port of the adjacent router. Data is transmitted in full only when the buffer of that channel is not full. The output of the FIFO buffer goes to the crossbar switch, which switches the data to the corresponding output port. The top module of the FIFO is shown in Figure 4.2. Its input and output data widths are 8 bits.

Figure 4.2 Top module of FIFO

4.2 Crossbar switch

The internal structure of the crossbar switch is shown in Figure 4.3. The crossbar switch is the heart of the router data path. It switches data from the input ports to the output ports, which is the essence of the router function. Internally, a crossbar consists of an array of multiplexers. In this architecture it consists of ten 10:1 multiplexers. All 10 inputs are connected to all ten multiplexers. The data forwarded to each output depends on the select lines, which are generated by the arbiter from the request signals. There are 10 select lines, so the output of each multiplexer depends on the select line of that multiplexer.

Figure 4.3 Internal Structure of Crossbar Switch (inputs IN1 to IN10, multiplexers with select lines SEL1 to SEL10, outputs OUT1 to OUT10)

4.3 Arbiter

The arbiter controls the arbitration of ports and resolves contention. It knows the current status of all ports: which ports are free, which ports are communicating with each other, and on which ports data contention can occur. Packets of the same priority that are destined for the same port are scheduled by a round-robin algorithm.

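The interaction between the round-robin arbiter and the crossbar of ten 10:1 multiplexers described in sections 4.2 and 4.3 can be sketched in Python as follows. This is only a behavioural illustration: the report's HDL is not shown in the text, so the names `RoundRobinArbiter`, `crossbar` and `route_cycle`, the use of one arbiter per output port, and the string payloads are all assumptions.

```python
from typing import List, Optional, Sequence

NUM_PORTS = 10  # the report's router has 10 ports


class RoundRobinArbiter:
    """Grants one requester per call, rotating priority after each grant."""

    def __init__(self, num_inputs: int = NUM_PORTS) -> None:
        self.num_inputs = num_inputs
        # Start so that input 0 has the highest priority on the first call.
        self.last_grant = num_inputs - 1

    def grant(self, requests: Sequence[bool]) -> Optional[int]:
        # Scan from the input after the last winner, wrapping around.
        for offset in range(1, self.num_inputs + 1):
            candidate = (self.last_grant + offset) % self.num_inputs
            if requests[candidate]:
                self.last_grant = candidate
                return candidate
        return None  # nobody is requesting this output


def crossbar(inputs: Sequence[str], selects: Sequence[Optional[int]]) -> List[Optional[str]]:
    """One N:1 multiplexer per output; selects[o] picks the input for output o."""
    return [None if sel is None else inputs[sel] for sel in selects]


def route_cycle(arbiters: Sequence[RoundRobinArbiter],
                inputs: Sequence[str],
                wanted_output: Sequence[Optional[int]]) -> List[Optional[str]]:
    """One switching cycle: arbitrate per output, then drive the crossbar.

    wanted_output[i] is the output port requested by input i (None = idle).
    """
    selects = []
    for out_port, arbiter in enumerate(arbiters):
        requests = [dest == out_port for dest in wanted_output]
        selects.append(arbiter.grant(requests))
    return crossbar(inputs, selects)


if __name__ == "__main__":
    arbiters = [RoundRobinArbiter() for _ in range(NUM_PORTS)]
    data = [f"p{i}" for i in range(NUM_PORTS)]
    # Inputs 0 and 1 both want output 3; the grant alternates between them.
    wanted = [3, 3] + [None] * 8
    for _ in range(3):
        print(route_cycle(arbiters, data, wanted)[3])
```

In this sketch two inputs contending for the same output are served alternately, which is the fairness property that round-robin scheduling provides for packets of equal priority.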
The arbiter releases the output port connected to the crossbar once data transmission on that port finishes. The port is then assigned to the next waiting port in the queue. The arbiter generates an output signal that drives the select line of the crossbar to select the corresponding port. The top module of the round-robin architecture is shown in Figure 4.4.

Figure 4.4 Top module of Round Robin Architecture

Chapter 5 SIMULATION RESULTS

Figure 5.1 Simulation of 10-port router in ModelSim 6.3f
Figure 5.2 Dataflow model of 10-port router in ModelSim 6.3f
Figure 5.3 Simulation of Round Robin Arbiter in ModelSim 6.3f
Figure 5.4 Simulation of FIFO in ModelSim 6.3f
Figure 5.5 RTL Schematic of 10-port Router in Xilinx ISE 14.4
Figure 5.6 Technology Schematic of 10-port Router in Xilinx ISE 14.4
Figure 5.7 Top Module of 10-port router in Xilinx ISE 14.4

Chapter 6 SYNTHESIS, POWER ANALYSIS & COVERAGE RESULTS

6.2 Delay analysis

Delay: 2.571 ns (Levels of Logic = 1)

Cell:in->out   Fanout   Gate delay (ns)   Net delay (ns)   Logical name (net name)
FDS:C->Q       11       0.591             0.968            out1_0 (out1_0)
LUT4:I3->O     1        0.704             0.000            out1_mux0000<0>641 (N786)
FDS:D                   0.308                              out1_0

Total: 2.571 ns (1.603 ns logic, 0.968 ns route) (62.3% logic, 37.7% route)

6.3 Design statistics

IOs: 171
BELS: 124
LUT2: 5
LUT3: 15
LUT4: 104
VCC: 1
Flip-flops/Latches: 8
FDS: 8

6.6 Power analysis results

Table 6.6(a) Device

Family             Spartan3e
Part               xc3s500e
Package            fg320
Temp Grade         Commercial
Process            Typical
Speed Grade        -4
Characterization   PRODUCTION,v1.2,06-23-09

Table 6.6(b) On-Chip Power Summary

On-Chip        Power (mW)   Used   Available   Utilization (%)
Clocks         0.00         1      -           -
Logic          0.00         123    9312        1
Signals        0.00         213    -           -
IOs            0.00         171    232         74
Static Power   80.98
Total          80.98

Table 6.6(c) Power Supply Currents

Supply Source   Supply Voltage   Total Current (mA)   Dynamic Current (mA)   Quiescent Current (mA)
Vccint          1.200            25.82                0.00                   25.82
Vccaux          2.500            18.00                0.00                   18.00
Vcco25          2.500            2.00                 0.00                   2.00

Table 6.6(d) Power Supply Summary

                    Total   Dynamic   Static Power
Supply Power (mW)   80.98   0.00      80.98

6.7 Coverage report

Table 6.7 Coverage Report

Enabled Coverage   Active   Hits   % Covered
Stmts              0        0      100
Branches           0        0      100
Conditions         0        0      100
Expressions        0        0      100
States             0        0      100
Transitions        0        0      100

Chapter 7 ADVANTAGES

1. High scalability: NoC architectures offer better scalability than bus architectures.
2. Reusability, and integration of diverse technologies.
3. The 10-port router has low delay and high speed.
4. The 10-port router chooses the shortest path for data transmission.
5. The proposed 10-port router is applicable to both 2D and 3D NoC architectures.

Chapter 8 CONCLUSION & FUTURE WORK

8.1 Conclusion

The proposed 10-port router design is simulated in ModelSim 6.3f and synthesized in Xilinx ISE 14.4. Coverage analysis is also done in ModelSim 6.3f and shows 100% statement coverage. The proposed design has low delay and low power consumption: a delay of 2.571 ns (Levels of Logic = 1) and a power consumption of 80.98 mW. Delay and power are minimized by using a crossbar switch. The main focus of the proposed work is an efficient router design for NoC applications. The router is the most important component, since it determines network parameters such as throughput and delay.

8.2 Future work

The ultimate goal of this project is to develop routers for 3D NoC. The work conducted so far is the first part of the whole project. Future work includes extending the router architectures and constructing an efficient NoC. The NoC will also be implemented on an FPGA.

REFERENCES

[1] Khalid Latif et al., "Power and Area Efficient Design of Network-on-Chip Router Through Utilization of Idle Buffers", Department of Information Technology, University of Turku, Finland, 17th International Conference, IEEE, 2010.
[2] Brett Stanley Feero, "Networks-on-Chip in a Three-Dimensional Environment: A Performance Evaluation", IEEE Transactions on Computers, vol. 58, no. 1, January 2009.
[3] Aamir Zia et al., "Highly-Scalable 3D CLOS NOC for Many-Core CMPs", IEEE, 2010.
[4] Terrence Mak et al., "Adaptive Routing in Network-on-Chips Using a Dynamic-Programming Network", IEEE, vol. 58, no. 8, August 2011.
[5] Kun-Chih Chen et al., "Topology-Aware Adaptive Routing for Nonstationary Irregular Mesh in Throttled 3D NoC Systems", IEEE, vol. 24, no. 10, October 2013.
[6] En-Jui Chang et al., "Path-Congestion-Aware Adaptive Routing with a Contention Prediction Scheme for Network-on-Chip Systems", IEEE, vol. 33, no. 1, January 2014.
[7] M. Lakshmiswethlana, V. Lavanya, Dr. R.R Ramanareddy, "Design and Verification Eight Port Router for Network on Chip", IEEE, 2012.
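The delay and power totals reported in sections 6.2 and 6.6 of the report above can be cross-checked from their components. The following Python sketch uses only the figures quoted in the report; the variable names are illustrative, and computing supply power as voltage times current summed over the supplies is an assumption about how the tool derives its total.

```python
# Delay components from section 6.2, in nanoseconds.
gate_delays_ns = [0.591, 0.704, 0.308]   # FDS C->Q, LUT4 I3->O, FDS D
net_delays_ns = [0.968, 0.000]           # routing after FDS and after LUT4

logic_ns = sum(gate_delays_ns)
route_ns = sum(net_delays_ns)
total_ns = logic_ns + route_ns
logic_share = 100.0 * logic_ns / total_ns
route_share = 100.0 * route_ns / total_ns

# Supply voltages (V) and total currents (mA) from Table 6.6(c).
supplies = {
    "Vccint": (1.200, 25.82),
    "Vccaux": (2.500, 18.00),
    "Vcco25": (2.500, 2.00),
}
# V * mA gives mW directly.
total_power_mw = sum(volts * milliamps for volts, milliamps in supplies.values())

if __name__ == "__main__":
    print(f"delay: {total_ns:.3f} ns "
          f"({logic_share:.1f}% logic, {route_share:.1f}% route)")
    print(f"supply power: {total_power_mw:.2f} mW")
```

Under that assumption the components agree with the reported totals: 2.571 ns of delay split 62.3% logic and 37.7% route, and 80.98 mW of supply power, all of it static.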

References

  1. "Balancing On-Chip Network Latency in Multi-Application Mapping for Chip-Multiprocessors". IPDPS. May 2014.
  2. NoCS 2007 website.
  3. On-Chip Networks Bibliography
  4. Inter/Intra-Chip Optical Network Bibliography
  5. MCSL NoC Traffic Patterns

Adapted from Avinoam Kolodny's column in the ACM SIGDA e-newsletter by Igor Markov
The original text can be found at http://www.sigda.org/newsletter/2006/060415.txt

External links