Part 1: The Central Processing Unit
- The Central Processing Unit is the primary component of a computer that performs most of the processing inside the computer.
- It is an electronic circuit that executes instructions comprising a computer program.
The Fetch-Decode-Execute Cycle
The CPU operates by repeatedly performing a cycle of three fundamental operations: Fetch, Decode, and Execute.
- Fetch: The CPU retrieves an instruction from memory
- The Program Counter (PC) specifies where in memory the instruction is located.
- After fetching, the PC is updated to point to the next instruction in the memory.
- Decode: The fetched instruction is decoded by the Control Unit (CU).
- It involves translating the instruction into a series of signals that control the other parts of the CPU.
- Execute: The CPU carries out the instruction.
- This might involve:
- performing arithmetic or logical operations in the ALU,
- accessing data in memory,
- or controlling I/O devices.
- The result of the execution is often stored in registers or written back to memory.
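The cycle above can be sketched as a loop over a toy machine. This is only an illustration: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the single `ACC` register, and the tuple instruction format are all invented here and do not correspond to any real instruction set.

```python
def run(program, memory_size=16):
    """Run a toy program and return (accumulator, data_memory)."""
    acc = 0                      # hypothetical single accumulator register
    pc = 0                       # Program Counter: index of the next instruction
    data = [0] * memory_size     # toy data memory

    while True:
        # Fetch: read the instruction the PC points to, then advance the PC.
        instruction = program[pc]
        pc += 1

        # Decode: split the instruction into an opcode and an operand.
        opcode, operand = instruction

        # Execute: carry out the operation.
        if opcode == "LOAD":       # load an immediate value into ACC
            acc = operand
        elif opcode == "ADD":      # arithmetic (the ALU's job)
            acc += operand
        elif opcode == "STORE":    # write the result back to memory
            data[operand] = acc
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")

    return acc, data


program = [("LOAD", 2), ("ADD", 3), ("STORE", 0), ("HALT", None)]
acc, data = run(program)
print(acc, data[0])  # prints "5 5"
```

A real CPU does these steps in hardware, with the Control Unit driving each stage, but the order of operations is the same.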
Key Components
- Arithmetic Logic Unit (ALU):
- Performs arithmetic operations and logical operations.
- The workhorse of the CPU, performing the actual calculations on data.
- Control Unit (CU)
- Coordinates and controls all the activities of the CPU.
- It fetches instructions, decodes them, and generates control signals to other CPU components.
- It ensures that instructions are executed in the correct order and that the data flows to the right places at the right time.
- Registers
- These are small, high-speed storage locations within the CPU.
- They often hold data and instructions that are actively processed
- They are much faster than the main memory.
- There are several types of registers:
- General Purpose Registers (GPRs): Used for storing data and intermediate results during calculations.
- Program Counter (PC): Holds the address of the next instruction to be fetched from memory.
- Instruction Register (IR): Holds the instruction that is currently being decoded or executed.
- Memory Address Register (MAR): Holds the address of a memory location that is being accessed.
- Memory Data Register (MDR): Holds the data that is being read from or written to memory.
- Status Register (Flags Register): Contains bits that indicate the status of the CPU and the results of the previous operations
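To make the link between the ALU and the Status Register concrete, here is a minimal sketch of an 8-bit addition that also produces three common flags. The function name and the choice of flags (zero, carry, negative) are assumptions for illustration; real CPUs define their own flag sets.

```python
def alu_add8(a, b):
    """Add two 8-bit values and return (result, flags)."""
    raw = (a & 0xFF) + (b & 0xFF)   # full sum, may need 9 bits
    result = raw & 0xFF             # keep only the low 8 bits
    flags = {
        "zero": result == 0,              # result is exactly zero
        "carry": raw > 0xFF,              # sum did not fit in 8 bits
        "negative": bool(result & 0x80),  # top bit set (sign bit in two's complement)
    }
    return result, flags


print(alu_add8(200, 100))  # result 44, carry set
print(alu_add8(255, 1))    # result 0, zero and carry set
```

Later instructions, such as conditional branches, typically read these flags to decide what to do next.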
Performance Metrics
- Clock Speed: Measures the number of clock cycles the CPU can execute per second, expressed in Hertz (Hz).
- Cores: A CPU can have many cores which can execute instructions independently.
- More cores = more tasks running in parallel = higher performance for multi-threaded workloads.
- Cache:
- Small, fast memory located within the CPU
- It stores frequently accessed data and instructions, reducing the need to access slower main memory.
- There are typically three levels of cache:
- L1: The smallest and fastest cache, located closest to the CPU cores.
- Typically split into L1 data cache and L1 instruction cache.
- L2: Larger and slower than L1, but still faster than RAM
- L3: The largest and slowest cache, typically shared by all cores in the CPU.
- Instruction Set Architecture (ISA)
- It defines the set of instructions that a CPU can understand and execute.
- It specifies the format of instructions, the available data types, and the addressing modes.
- Instructions Per Cycle (IPC)
- The average number of instructions a CPU can execute per clock cycle.
- Higher IPC = better performance
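Clock speed, IPC, and core count can be combined into a rough upper bound on throughput. The sketch below assumes an idealized CPU where every core sustains the same IPC and nothing stalls on memory, which real workloads rarely achieve; the function name and the example numbers are hypothetical.

```python
def peak_instructions_per_second(clock_hz, ipc, cores=1):
    """Idealized peak throughput: cycles/second * instructions/cycle * cores."""
    return clock_hz * ipc * cores


# Hypothetical 3 GHz CPU with an average IPC of 2.
single_core = peak_instructions_per_second(3_000_000_000, 2)
quad_core = peak_instructions_per_second(3_000_000_000, 2, cores=4)
print(single_core)  # 6000000000
print(quad_core)    # 24000000000
```

This is why a CPU with a lower clock speed can still outperform one with a higher clock speed if its IPC is higher.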
CPU Architectures
- x86
- The dominant architecture for desktop/laptop computers.
- Uses Complex Instruction Set Computing (CISC) architecture.
- It supports backward compatibility with older x86 processors.
- ARM
- The dominant architecture for mobile devices and embedded systems.
- Uses Reduced Instruction Set Computing (RISC) architecture
- Generally more power-efficient than x86.
- RISC-V
- A free and open-source ISA
- Uses RISC
Other Terms
- Pipelining
- Instead of waiting for one instruction to completely finish, pipelining allows multiple instructions to be in different stages of execution.
- Example: For each clock cycle, the CPU may:
- Fetch Instruction 1.
- Fetch Instruction 2, Decode Instruction 1.
- Fetch Instruction 3, Decode Instruction 2, Execute Instruction 1.
- Fetch Instruction 4, Decode Instruction 3, Execute Instruction 2.
- and so on…
- Superscalar Execution
- Here, the CPU can fetch, decode, or execute multiple instructions at the same time, provided that these instructions are independent of each other.
- This means that several instructions can occupy the same stage in the same clock cycle, whereas pipelining alone only overlaps different stages of different instructions.
- Branch Prediction
- It is a process where the CPU tries to predict whether a branch should be taken or not
- If the prediction is incorrect, the pipeline has to be flushed and refilled with the correct instructions, which hurts performance.
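The pipelining example above can be generated programmatically. This is a minimal sketch that assumes an ideal three-stage pipeline where every stage takes exactly one clock cycle and no stalls or mispredicted branches occur; the function name and the `I1`, `I2` labels are made up for illustration.

```python
STAGES = ["Fetch", "Decode", "Execute"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Return, for each clock cycle, the list of stage/instruction pairs in flight."""
    total_cycles = num_instructions + len(stages) - 1
    schedule = []
    for cycle in range(total_cycles):
        active = []
        for stage_index, stage in enumerate(stages):
            # The instruction in this stage entered the pipeline stage_index cycles ago.
            instr = cycle - stage_index
            if 0 <= instr < num_instructions:
                active.append(f"{stage} I{instr + 1}")
        schedule.append(active)
    return schedule


for cycle, active in enumerate(pipeline_schedule(4), start=1):
    print(f"Cycle {cycle}: {', '.join(active)}")
```

With four instructions the schedule finishes in 6 cycles. Under the same one-cycle-per-stage assumption, running them strictly one after another would take 4 × 3 = 12 cycles.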
Part 2: Memory
RAM
- Random Access Memory (RAM)
- Readable/Writable: Data can be read from and written to RAM.
- Volatile: RAM loses its data when power is turned off.
- Primary Storage: Used as the main memory for the operating system, applications and data that are currently being used by the CPU.
- Fast Access: Provides fast access to any memory location.
Types of RAM
- Dynamic RAM (DRAM)
- Most common type of RAM in computers
- Stores each bit of data in a separate capacitor within an integrated circuit
- Requires periodic refreshing.
- Relatively slow
- Very cheap.
- Can store more data in the same amount of space (high density).
- Static RAM (SRAM)
- Faster than DRAM, but also more expensive
- Stores data using flip-flops
- It does not require refreshing
- Commonly used for CPU Cache
- DDR4 and DDR5
- DDR SDRAM stands for Double Data Rate Synchronous Dynamic RAM; DDR4 and DDR5 are its fourth and fifth generations.
- It is an evolution of SDRAM
- SDRAM synchronizes operations with the system clock to improve performance.
- DDR memory transfers data on both the rising and falling edges of the clock signal, which doubles the data transfer rate.
- DDR4 vs. DDR5: DDR5 is the newest standard, which offers higher speeds, greater capacity, and improved power efficiency.
- It also incorporates on-die ECC (Error Correction Code) for reliability.
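The effect of transferring on both clock edges can be worked out with simple arithmetic. The sketch below assumes a 64-bit (8-byte) memory channel and uses DDR4-3200, whose I/O clock runs at 1600 MHz, as the example. The function name is hypothetical, and the result is a theoretical peak rather than what applications see in practice.

```python
def ddr_peak_bandwidth_bytes_per_s(io_clock_hz, bus_width_bits=64):
    """Theoretical peak bandwidth of one DDR memory channel."""
    transfers_per_second = io_clock_hz * 2          # rising + falling edge
    return transfers_per_second * bus_width_bits // 8


# DDR4-3200: 1600 MHz I/O clock -> 3200 million transfers per second.
peak = ddr_peak_bandwidth_bytes_per_s(1_600_000_000)
print(peak / 1e9, "GB/s")  # 25.6 GB/s
```

Without the double data rate (one transfer per cycle), the same clock and bus width would give half that figure.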
ROM
- Read-Only Memory (ROM)
- Read-only: Data can only be read from ROM
- Some types of ROM can be erased and reprogrammed under special conditions.
- Non-volatile: It retains its data when the power is off.
- Permanent Storage: Used to store firmware, boot code, and other essential software that needs to be available on boot.
- Slower: Generally slower to access than RAM.
Types of ROM
- Programmable ROM (PROM)
- Can only be programmed once.
- Once programmed, the data cannot be changed.
- Typically programmed by burning (blowing) fuses to represent the desired data.
- Erasable Programmable ROM (EPROM)
- Can be erased and reprogrammed.
- Erasure involves exposing the chip to ultraviolet (UV) light.
- Electrically Erasable Programmable ROM (EEPROM)
- Can be erased and reprogrammed electrically, without the need for UV light
- EEPROMs can be erased and reprogrammed on the byte level
- Much more flexible than EPROMs
- Flash Memory
- A type of EEPROM that is optimized for high-density storage.
- It can be erased and reprogrammed in blocks (rather than at byte level)
- Used in SSDs, USB drives, and memory cards.
Firmware and the Boot Process
- Firmware Storage: ROM is used to store firmware, which is software that is embedded in hardware. Firmware is essential for the basic operation of the device.
- Boot Process: ROM typically contains the boot code, which is the first code that is executed when the computer is turned on. This code initializes the hardware and loads the operating system from the hard drive into RAM.
- BIOS/UEFI: The BIOS (Basic Input/Output System) or UEFI (Unified Extensible Firmware Interface) is a type of firmware stored in ROM that provides a low-level interface for the hardware. It handles tasks such as system startup, hardware configuration, and basic input/output operations.
Other Terms
- Latency: The delay between when the CPU requests data from memory and when the data is actually available.
- Measured in nanoseconds (ns).
- Bandwidth: The rate at which the data can be transferred between the CPU and memory.
- Measured in bytes per second (B/s)
- Memory Hierarchy:
- Memory is structured in this way:
- Registers
- Cache (L1, L2, L3)
- RAM
- Disk (SSD/HDD)
- The closer the memory is to the CPU, the faster it is.
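Latency and bandwidth together determine how long a memory transfer takes. A simple first-order model is time = latency + size / bandwidth. The sketch below uses hypothetical figures (100 ns latency, 25.6 GB/s bandwidth) purely to show how the two terms trade off; the function name is made up.

```python
def transfer_time_s(size_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order estimate: fixed delay plus time to stream the data."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


LATENCY = 100e-9      # hypothetical 100 ns
BANDWIDTH = 25.6e9    # hypothetical 25.6 GB/s

small = transfer_time_s(64, LATENCY, BANDWIDTH)           # one 64-byte cache line
large = transfer_time_s(1024 * 1024, LATENCY, BANDWIDTH)  # 1 MiB block

print(f"{small * 1e9:.1f} ns")   # about 102.5 ns, dominated by latency
print(f"{large * 1e6:.2f} us")   # about 41.06 us, dominated by bandwidth
```

Small transfers are limited mostly by latency, while large transfers are limited mostly by bandwidth, which is why both metrics matter.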
Part 3: Storage
There are two main types of storage:
- Primary: Volatile, fast, and used for actively running programs.
- Secondary: Non-volatile, slower than primary storage, and used for long-term storage of data and applications.
Hard Disk Drives (HDDs)
They store data magnetically on rotating platters.
A read/write head moves across the surface of the platters to read and write data.
- Components
- Platters: Circular disks made of non-magnetic material covered with magnetic material.
- This is where data is stored.
- Read/Write Heads: Devices that read and write data to the platters
- Actuator Arm: Moves the read/write heads across the surface of the platters
- Spindle: Rotates the platters at a constant speed
- Characteristics
- Capacity: Measured in bytes
- Revolutions Per Minute (RPM): The speed at which the platters rotate.
- Higher RPM = faster access time.
- Common speeds are 5400 RPM and 7200 RPM
- Access Time: The time it takes to reach a specific location on the platter for reading or writing.
- Measured in milliseconds (ms)
- Latency (Rotational Latency): The average time it takes for the desired sector to rotate under the read/write head.
- Transfer Rate (Data Transfer Rate): The rate at which data can be transferred between the HDD and the computer.
- Measured in megabytes per second (MB/s).
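The relationship between RPM and rotational latency follows from the fact that, on average, the desired sector is half a revolution away from the head. A minimal sketch (the function name is made up for illustration):

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    seconds_per_revolution = 60 / rpm
    return seconds_per_revolution / 2 * 1000


for rpm in (5400, 7200):
    print(f"{rpm} RPM: {average_rotational_latency_ms(rpm):.2f} ms")
# 5400 RPM: 5.56 ms
# 7200 RPM: 4.17 ms
```

This covers only the rotational part of the access time. Seek time (moving the actuator arm) is added on top of it.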
Solid State Drives
SSDs store data in flash memory chips.
They have no moving parts and are much faster and more durable than HDDs.
- Components:
- Flash Memory Chips: Store the data.
- Controller: Manages the flash memory chips, performs wear leveling, and handles data transfers.
- Cache (Optional): Some SSDs have a small amount of cache memory (typically DRAM) to improve performance.
- Characteristics:
- Capacity: Typically measured in gigabytes (GB) or terabytes (TB).
- Access Time: Much faster access times compared to HDDs, typically measured in microseconds (µs).
- Transfer Rate (Data Transfer Rate): Much faster transfer rates compared to HDDs, typically measured in gigabytes per second (GB/s).
- NVMe (Non-Volatile Memory Express):
- Interface Protocol: Not a storage device itself, but an interface protocol designed specifically for SSDs.
- PCIe Interface: NVMe SSDs connect directly to the PCIe bus, which provides much higher bandwidth than SATA.
- Significantly Faster: NVMe SSDs are significantly faster than SATA SSDs.
Other Storage Devices
- Optical Drives (CD, DVD, Blu-ray):
- Operation: Use lasers to read and write data to optical discs.
- CD (Compact Disc): Stores up to 700 MB of data.
- DVD (Digital Versatile Disc): Stores up to 4.7 GB (single-layer) or 8.5 GB (dual-layer) of data.
- Blu-ray Disc: Stores up to 25 GB (single-layer) or 50 GB (dual-layer) of data.
- Advantages: Portable, relatively inexpensive media.
- Disadvantages: Slower access times compared to HDDs and SSDs, limited storage capacity, susceptible to scratches and damage.
- USB Drives (Flash Drives):
- Operation: Use flash memory to store data.
- Advantages: Portable, convenient, relatively inexpensive.
- Disadvantages: Limited storage capacity compared to HDDs and SSDs, can be easily lost or damaged.
- Tape Drives:
- Operation: Store data sequentially on magnetic tape.
- Advantages: High capacity, relatively low cost per terabyte, used for archival storage and backups.
- Disadvantages: Very slow access times, sequential access only (not random access).
RAID Configuration
- RAID (Redundant Array of Independent Disks) is a technology that combines multiple physical hard drives or SSDs into a single logical unit to improve performance, redundancy, or both.
- There are a number of different RAID configurations, depending on your use case:
- RAID 0: Improves performance by striping data across multiple drives.
- RAID 1: Provides redundancy by mirroring data on two or more drives.
- RAID 5: Improves performance with striping and provides redundancy by distributing parity information across the drives. If one drive fails, the data can be reconstructed from the parity information.
- RAID 10: Provides both mirroring and striping for high performance and redundancy
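The parity used by RAID 5 is commonly computed as a bytewise XOR of the data blocks, which is what makes reconstruction possible. The sketch below shows the idea for a single stripe across three data drives. The block contents and the helper name `xor_blocks` are made up, and a real RAID 5 array also rotates which drive holds the parity from stripe to stripe.

```python
def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# One stripe: three data blocks plus one parity block.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second drive, then rebuild its block
# from the surviving data blocks and the parity block.
surviving = [data_blocks[0], data_blocks[2], parity]
recovered = xor_blocks(surviving)
print(recovered)  # b'BBBB'
```

XOR works here because XOR-ing a value with itself cancels it out, so combining the parity with every surviving block leaves only the missing one.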
Part 4: Input/Output Devices
This section will simply focus on the mechanism behind input/output devices, skipping known sections.
Controllers and Drivers
- I/O Controllers
- These are hardware components that manage the communication between the CPU and an I/O device
- Each I/O device has its own controller.
- It handles tasks like data buffering, error detection, and protocol conversion.
- Device Driver
- It is software that allows the operating system to communicate with an I/O device.
- It provides a software interface for the device.
- It translates generic OS commands into device-specific commands.
I/O Interfaces
- USB (Universal Serial Bus):
- A versatile interface for connecting a wide range of devices to the computer.
- Different USB versions exist: USB 1.0, USB 2.0, USB 3.0, USB 3.1, USB 3.2, USB4.
- Each version offers different data transfer speeds.
- USB Type-A, Type-B, Type-C are different connector types.
- USB Power Delivery (USB PD) allows devices to be charged via USB.
- HDMI (High-Definition Multimedia Interface):
- A digital interface for transmitting high-definition video and audio signals.
- Used to connect monitors, TVs, and other display devices to the computer.
- DisplayPort:
- Another digital interface for transmitting video and audio signals.
- Similar to HDMI, but often used in computer displays and professional applications.
- Supports higher refresh rates and resolutions than HDMI in some cases.
- Ethernet:
- A network interface for connecting computers to a network.
- Uses cables (typically Cat5e or Cat6) to transmit data.
- Different Ethernet standards exist (10BASE-T, 100BASE-TX, 1000BASE-T, 10GBASE-T) with different data transfer speeds.
- Thunderbolt:
- A high-speed interface that combines PCIe and DisplayPort into a single connector.
- Used for connecting high-performance peripherals such as external storage devices, displays, and docking stations.
- Different Thunderbolt versions exist (Thunderbolt 3, Thunderbolt 4) with different data transfer speeds.
Interrupts and Direct Memory Access
- Interrupts
- It is a signal sent from an I/O device to the CPU to request attention.
- When the CPU receives an interrupt, it suspends the current task and executes an interrupt handler.
- Interrupts allow I/O devices to get the CPU's attention only when needed, so the CPU does not have to check (poll) them continuously.
- Direct Memory Access (DMA)
- It allows I/O devices to transfer data directly to or from memory without CPU involvement.
- The DMA controller manages the data transfer.
I/O Architectures
- Programmed I/O (PIO)
- The CPU directly controls the I/O device then reads and writes data to the device registers.
- Simple to implement but inefficient
- This is because the CPU is tied up during the entire I/O operation.
- Interrupt-Driven I/O
- The CPU initiates the I/O operation, then continues with other tasks.
- When the I/O device is ready, it sends an interrupt to the CPU.
- The CPU then handles the I/O request in an interrupt handler.
- More efficient than PIO.
- Direct Memory Access
- The I/O device transfers data directly to or from memory without involving the CPU.
- The DMA controller manages the data transfer.
- Most efficient I/O architecture, as the CPU is not involved in the data transfer.
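The difference between the three approaches can be illustrated with a deliberately crude cost model that counts how many "steps" the CPU spends on a transfer. Everything here is an assumption for illustration: the function name, the step counts, and the idea that each word costs one CPU step to copy are invented, not measurements of real hardware.

```python
def cpu_busy_steps(mode, words, wait_steps_per_word=10):
    """Toy model: CPU steps consumed while transferring `words` words."""
    if mode == "pio":
        # CPU polls the device while it waits, then copies each word itself.
        return words * (wait_steps_per_word + 1)
    if mode == "interrupt":
        # CPU is free while the device works; it only copies each word
        # when the interrupt arrives.
        return words
    if mode == "dma":
        # CPU sets up the transfer once and handles one completion interrupt.
        return 2
    raise ValueError(f"unknown mode: {mode}")


for mode in ("pio", "interrupt", "dma"):
    print(mode, cpu_busy_steps(mode, words=100))
# pio 1100
# interrupt 100
# dma 2
```

The absolute numbers are meaningless, but the ordering matches the list above: PIO ties up the CPU the most, and DMA the least.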
Part 5: Motherboard and Chipset
The Motherboard
It is the primary circuit board within a computer.
- It serves as the central hub that connects all the components of a computer, including the CPU, memory, storage devices, expansion cards, and I/O devices.
- It also distributes power to all of the components.
Chipset
A chipset is the set of chips that controls communication between the CPU and the other components. There are two main chipset architectures.
- Traditional: Used in older systems. The chipset was divided into two main chips, northbridge and southbridge.
- Northbridge
- Directly connected to the CPU.
- Controls high-speed communication with RAM and the graphics card (for example, via PCIe).
- Also known as the Memory Controller Hub (MCH). When the memory controller is built into the CPU instead, it is called the Integrated Memory Controller (IMC).
- Southbridge
- Connected to the Northbridge
- Controls slower I/O devices such as USB, SATA, Ethernet, audio, and legacy devices
- Also known as the I/O Controller Hub (ICH)
- Modern: The functions of the Northbridge are integrated into the CPU itself. Most of the functions of the Southbridge are handled by a single chip called the Platform Controller Hub.
- Platform Controller Hub (PCH)
- Handles most of the I/O functions previously managed by the Southbridge.
- Connects to the CPU via a high-speed interface.
Expansion Slots
These are the sockets on the motherboard that allow you to add expansion cards to the computer.
- Peripheral Component Interconnect Express (PCIe)
- The most common type of expansion slot
- Used for graphics cards, sound cards, network cards, storage controllers, and other peripherals.
- Different PCIe versions as well as PCIe lane configurations are available for different data transfer speeds and bandwidth.
- Serial ATA (SATA)
- Used to connect storage devices to the motherboard.
- M.2
- A small form factor connector for SSDs.
- Supports both SATA and PCIe interfaces.
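How PCIe version and lane count combine into bandwidth can be shown with a short calculation. The per-lane transfer rates and line encodings below are the published figures for PCIe generations 1 through 5. The function name is hypothetical, and the result is a per-direction theoretical peak that ignores packet and protocol overhead.

```python
# generation -> (transfers per second per lane, encoding efficiency)
PCIE_GEN = {
    1: (2.5e9, 8 / 10),      # 8b/10b encoding
    2: (5.0e9, 8 / 10),
    3: (8.0e9, 128 / 130),   # 128b/130b encoding
    4: (16.0e9, 128 / 130),
    5: (32.0e9, 128 / 130),
}


def pcie_bandwidth_bytes_per_s(generation, lanes):
    """Theoretical peak bandwidth in one direction for a PCIe link."""
    transfers_per_s, efficiency = PCIE_GEN[generation]
    return transfers_per_s * efficiency * lanes / 8   # bits -> bytes


print(f"Gen3 x16: {pcie_bandwidth_bytes_per_s(3, 16) / 1e9:.2f} GB/s")  # 15.75
print(f"Gen4 x4:  {pcie_bandwidth_bytes_per_s(4, 4) / 1e9:.2f} GB/s")   # 7.88
```

The x16 case is the usual link width for a graphics card, and x4 is the usual width for an M.2 NVMe SSD.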
System Bus and CPU Sockets
- System Bus
- A collection of electrical pathways on the motherboard that allow different components to communicate with each other.
- Some types of buses are:
- Memory Bus: Connects the CPU (or Northbridge) to the RAM.
- PCIe Bus: Connects the CPU (or chipset) to PCIe expansion slots.
- SATA Bus: Connects the chipset to SATA storage devices.
- USB Bus: Connects the chipset to USB ports.
- CPU Socket
- It is a connector on the motherboard that holds the CPU.
- Different CPU sockets are designed for different CPU architectures and manufacturers.
- It also determines which CPUs are compatible with the motherboard.
Motherboard Form Factors
A form factor is a specification that defines the size, shape, mounting hole locations, power supply requirements, and other physical characteristics of a motherboard.
Some of the common form factors are:
- Advanced Technology Extended (ATX)
- The most common form factor for desktop computers.
- Offers good expansion capabilities and airflow.
- Micro-ATX:
- Smaller than ATX.
- Offers fewer expansion slots but can fit in smaller cases.
- Mini-ITX
- Even smaller than Micro-ATX
- Designed for small form factor computers and embedded systems.
- Typically has only one expansion slot.
BIOS/UEFI
- BIOS (Basic Input/Output System):
- Firmware stored on a ROM chip on the motherboard.
- The first software that runs when the computer is powered on.
- Performs a power-on self-test (POST) to check the hardware.
- Initializes the hardware and loads the operating system from the hard drive into RAM.
- Provides a low-level interface for the hardware.
- UEFI (Unified Extensible Firmware Interface):
- A more modern replacement for the BIOS.
- Offers several advantages over the BIOS, including a graphical user interface (GUI), support for larger hard drives, and improved security features.
- Supports secure boot, which helps prevent malware from loading during the boot process.
- Key Functions of BIOS/UEFI:
- POST (Power-On Self-Test): Checks the hardware components for errors during startup.
- Boot Loader: Loads the operating system from the storage device.
- Setup Utility: Allows users to configure hardware settings (e.g., boot order, clock speeds, fan speeds)
Part 6: Graphics Processing Unit
It is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device.
Primary Uses and Roles
- Rendering Graphics: Its primary role is to accelerate the rendering of images, videos, and animations.
- It performs the complex calculations needed to transform 3D models into 2D images that can be displayed on a screen.
- Parallel Processing Power: GPUs are designed for parallel processing, meaning they can perform many calculations simultaneously.
- Offloading from CPU: It takes the workload of graphics processing off the CPU, freeing it up to perform other tasks.
Types
There are two types of GPUs: Integrated and Discrete (Dedicated).
- Integrated GPU
- Built into the CPU or the motherboard chipset.
- Shares system memory with the CPU.
- Has lower power consumption.
- Less expensive.
- Suitable for basic graphics tasks
- Dedicated GPU
- A separate expansion card that plugs into the PCIe slot on the motherboard.
- Has its own dedicated memory.
- Higher power consumption
- More expensive
- Offers much better performance in gaming, video editing, and other graphics intensive tasks.
| Feature | Integrated GPU | Dedicated GPU |
|---|---|---|
| Location | CPU/Chipset | Expansion Card |
| Memory | Shared System RAM | Dedicated VRAM |
| Power Consumption | Lower | Higher |
| Performance | Lower | Higher |
| Cost | Lower | Higher |
Components
- Cores
- Has different names based on architecture:
- CUDA cores for NVIDIA.
- Stream Processors or Compute Units for AMD.
- These are the fundamental processing units within the GPU.
- Each core can execute instructions independently.
- Modern GPUs have hundreds or even thousands of cores
- Memory (VRAM - Video RAM)
- Dedicated memory used to store textures, frame buffers, and other data needed for graphics processing.
- Some common types of VRAM are GDDR6 and GDDR6X.
- More VRAM = support for more complex scenes and higher resolutions
- Clock Speed
- The speed at which GPU cores operate.
- Higher clock speed = faster performance
GPU Architectures
- NVIDIA
- Dominant player in the GPU market
- GeForce RTX Series: High end GPUs for gaming and professional applications
- Tensor Cores: Specialized cores made for accelerating AI and ML tasks
- Ray Tracing: Has hardware support for ray tracing, a rendering technique that simulates realistic lighting.
- AMD
- NVIDIA’s major competitor
- Radeon RX Series: High-end GPUs for gaming
- FidelityFX Super Resolution (FSR): An upscaling technology that improves performance with minimal loss of image quality
GPU APIs
- Open Graphics Library (OpenGL)
- A cross platform graphics API that can be used on a variety of operating systems (Windows, macOS, Linux).
- An open standard, meaning it is not controlled by any single company.
- DirectX
- A set of APIs developed by Microsoft for Windows
- Includes Direct3D, which is used for 3D graphics rendering.
Common Graphics Rendering Terms
- Shaders
- These are small programs that run on the GPU that determine how each pixel is rendered.
- Used to apply lighting, shadows, textures, and other scene effects.
- Has different types: vertex shaders, pixel shaders, geometry shaders.
- Textures
- These are images that are applied to the surfaces of 3D models to add detail and realism.
- These can be simple images or complex patterns.
- Various texture filtering techniques can be used to improve the quality of the textures.
- Frame Buffer
- A memory buffer that stores the final image that will be displayed on the screen.
- The GPU renders the image into this buffer.
- The contents of the frame buffer are then sent to the monitor for display.
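The size of a frame buffer follows directly from the resolution and the color depth. The sketch below assumes 4 bytes per pixel (for example 8 bits each for red, green, blue, and alpha), which is a common but not universal format; the function name is made up.

```python
def frame_buffer_bytes(width, height, bytes_per_pixel=4):
    """Memory needed to hold one full frame."""
    return width * height * bytes_per_pixel


print(frame_buffer_bytes(1920, 1080))  # 8294400  (about 7.9 MiB)
print(frame_buffer_bytes(3840, 2160))  # 33177600 (about 31.6 MiB)
```

This is one reason higher resolutions need more VRAM: a 4K frame takes four times the memory of a 1080p frame, before textures and other data are counted.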
General Purpose GPUs
General-purpose computing on GPUs (GPGPU) is the use of GPUs to perform tasks outside of graphics rendering.
GPUs are used in this way because of their immense capacity for parallel computation.
They can be used in tasks like:
- Scientific Simulations
- Machine Learning
- Data Analysis
- Cryptography
- CUDA and OpenCL:
- CUDA (Compute Unified Device Architecture): A parallel computing platform and API developed by NVIDIA.
- OpenCL (Open Computing Language): An open standard for parallel programming that can be used on a variety of hardware platforms, including CPUs, GPUs, and FPGAs