Let's start with a few questions: what is multithreading, why do we need it, and when is it required?
Writing a multithreaded program is easy, but ensuring data correctness and a fair division of work among threads is harder. Overburdening some threads while leaving others idle leads to inefficiency; in simple terms, the workload should be evenly distributed across all threads. We will cover all of these topics in this article.
Before going forward, let’s understand what multitasking is.
In an operating system, we can run multiple programs at the same time. For example, you can edit a document while printing, sending an email, or watching movies in media player.
But what is the difference between multithreading and multitasking?
Multithreading is like multitasking, but instead of running multiple programs at once, it’s about a single program performing multiple tasks simultaneously. Each task is called a thread. Programs that use multiple threads are called multithreaded programs.
For example:
A web browser can download multiple images while still allowing you to scroll the page.
A web server can handle requests from many users at the same time.
Java uses a thread for garbage collection (cleaning up memory) while your program runs.
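To make this concrete, here is a minimal Java sketch of a program running two tasks on separate threads (the class name and task messages are my own, for illustration):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HelloThreads {
    public static void main(String[] args) throws InterruptedException {
        // A thread-safe list that both tasks write into (shared memory).
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        // Each Runnable is a task; each Thread executes one task concurrently.
        Thread downloader = new Thread(() -> log.add("image downloaded"));
        Thread scroller = new Thread(() -> log.add("page scrolled"));

        downloader.start(); // both tasks now run concurrently
        scroller.start();
        downloader.join();  // wait for both tasks to finish
        scroller.join();

        System.out.println(log.size() + " tasks completed"); // 2 tasks completed
    }
}
```

The order in which the two tasks finish is not guaranteed; only the final count is deterministic, which is exactly the kind of nondeterminism multithreaded code must be designed around.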
Threads vs. Processes
A thread is a smaller unit of execution within a process (a set of instructions being executed). A process can have multiple threads, and all of its threads share the same memory. Because threads within the same program share data, they are faster and more lightweight than separate processes, but this sharing is risky if not handled carefully: it can lead to race conditions and data inconsistency. These issues can be avoided using locks, which we will discuss in greater depth later in this article.
Example: In a browser, one thread handles page loading while another plays a video.
Now, let’s see how the CPU allocates resources to threads.
Based on this, we can decide when to use multithreading and how many threads to create so that a program runs faster while using resources efficiently. Creating too many threads is counterproductive: it can actually slow the application down because of increased context switching. Confused? I will explain this now in detail.
Very Important Point:
One core can run only one thread at any given point in time. All the logic of multithreading depends on this fact. If you assign two threads to a single core, one thread must wait until it gets CPU time. Switching between threads in this way is called context switching, and it happens so fast that everything appears to run simultaneously.
What is Context Switching?
Context switching is the process of saving the state of the currently running process or thread and loading the state of another. The operating system performs context switching to switch between processes or threads, and it is an essential part of any multithreaded program. While context switching is necessary, it introduces overhead because the OS needs to save and restore the state of registers, memory mappings, and program counters. To minimize this overhead, modern operating systems use efficient context-switching algorithms and mechanisms like kernel preemption, which allows the OS to interrupt running processes and give other processes a chance to execute.
Impact on Application Performance: Context switching can significantly improve or degrade application performance. Consider the following cases:
When Threads Perform I/O Operations
Threads may perform I/O operations like reading/writing data to disk or making network calls. In such cases, context switching is beneficial because it allows other threads to utilize CPU time instead of leaving the CPU idle while waiting for I/O operations to complete.
When Threads Perform CPU-Bound Tasks
If threads are performing calculations or other CPU-intensive work, such as image/video processing, that requires constant CPU time until the task completes, context switching introduces extra overhead. In such cases, frequent context switches reduce system performance by wasting CPU cycles on saving and restoring thread states.
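The distinction above suggests how to size a thread pool. As a rough sketch (the class name and the wait/compute heuristic are my own, commonly cited rules of thumb, not precise guarantees):

```java
public class ThreadSizing {
    // For CPU-bound work, more threads than cores mostly adds
    // context-switching overhead; a common starting point is one per core.
    static int cpuBoundPoolSize() {
        return Runtime.getRuntime().availableProcessors();
    }

    // For I/O-bound work, threads spend most of their time waiting, so the
    // pool can be larger. Heuristic: cores * (1 + waitTime / computeTime).
    static int ioBoundPoolSize(double waitTime, double computeTime) {
        int cores = Runtime.getRuntime().availableProcessors();
        return (int) (cores * (1 + waitTime / computeTime));
    }

    public static void main(String[] args) {
        System.out.println("CPU-bound pool size: " + cpuBoundPoolSize());
        // A task that waits 90% of the time can share a core with many peers.
        System.out.println("I/O-bound pool size: " + ioBoundPoolSize(90, 10));
    }
}
```

On a machine with 8 cores, the heuristic above would suggest 8 threads for pure computation but 80 for tasks that spend 90% of their time blocked on I/O.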
Who takes care of all these operations?
The operating system (OS) takes care of ensuring that multitasking and multithreading are effective and efficient by managing system resources like the CPU, memory, and I/O devices. Here's a detailed breakdown of how the OS handles these resources:
Process Scheduling
The OS manages CPU resources using a process scheduler. The scheduler ensures that multiple processes or threads are allocated CPU time in an efficient manner. The OS uses different scheduling algorithms, such as Round-Robin, First-Come-First-Serve (FCFS), Priority Scheduling, or more advanced ones like Multilevel Queue Scheduling. These algorithms determine how long each process or thread runs before the OS switches to another process.
To achieve multitasking (running multiple processes concurrently), the OS divides CPU time into small units called time slices. Each process is assigned a time slice (or quantum) during which it can execute. Once a time slice is exhausted, the OS performs a context switch to another process. This context switch involves saving the state of the current process and loading the state of the next, so that each process appears to run simultaneously even though only one process is actually using the CPU at a time.
To implement multithreading (executing multiple threads of a single process concurrently), the OS treats each thread as an independent entity. The scheduler assigns time slices to each thread within a process, allowing the OS to manage the execution of individual threads.
Imagine the operating system as a super-efficient traffic controller managing multiple cars (in our context, processes) on a complex highway system. The CPU is like the main road, and processes are the vehicles trying to move forward. The operating system uses a scheduling mechanism called time-slicing, where each process gets a short burst of time on the CPU. It's similar to a roundabout where each car gets a brief moment to pass through, creating the illusion that everything is moving simultaneously.
Memory Management
The OS manages system memory (RAM) by giving each process its own isolated memory while efficiently sharing memory resources. The OS ensures that one process cannot access the memory allocated to another process, maintaining system stability and security. When RAM is full, the system uses a virtual memory mechanism, an advanced memory management technique that compensates for physical RAM limitations.
Virtual memory creates the illusion of having more memory than the physical RAM by using hard disk space as an extension of RAM. When RAM is full, less-used memory pages (data is mapped to pages) are moved to the hard disk (swap space). When these pages are needed again, they are swapped back into RAM.
This mechanism enables more processes to run simultaneously, even with limited physical RAM.
Memory management works like a smart apartment complex. Each process gets its own apartment (memory space) that's completely separate from others. When there's not enough physical space, the operating system creates virtual memory - think of it like a storage unit where less frequently used items are temporarily moved out to make room for more important things. This allows multiple processes to run without interfering with each other.
I/O Management
I/O devices like hard drives, keyboards, monitors, etc. require careful management to ensure processes can access them without conflicts and to avoid blocking the CPU. The OS uses the following techniques:
Device Drivers:
The OS uses device drivers, which are specialized programs that enable communication between the OS and hardware devices. These drivers abstract the complexity of interacting with different devices and provide a uniform interface for the OS to manage devices.
I/O Scheduling:
The OS uses algorithms to manage I/O requests from multiple processes. Requests are queued and the OS ensures efficient access of processes to I/O devices by using scheduling policies like First-Come-First-Serve (FCFS) or Shortest Seek Time First (SSTF), which optimize access time based on the current state of the device.
Interrupts:
To prevent the CPU from constantly checking whether an I/O operation is complete, which would waste valuable processing time, the OS uses interrupts. When an I/O device is ready for the CPU to process its data, it sends an interrupt signal. The OS then pauses the current process and switches to handle the interrupt, ensuring efficient use of CPU time.
Synchronization
Multithreading allows multiple threads within the same process to run concurrently, sharing the process's memory and resources. Because threads share the same memory space, they may need to communicate or synchronize with each other to prevent race conditions or data corruption. There are two mechanisms for protecting a code block from concurrent access: 1. Locks 2. Condition Objects.
Locks: Locks ensure that shared resources, such as data, files, or variables, are accessed safely and consistently by multiple threads. There are two types of locking mechanisms:
Pessimistic Locking: This approach uses lock objects to make the critical section of the code mutually exclusive, so only one thread can execute it at a time. Such a lock is called a mutex (mutual exclusion). It is commonly used in systems where write operations are frequent or conflicts are likely. Pros: ensures data consistency by preventing concurrent updates. Cons: reduces concurrency, since other threads must wait, which lowers performance; there is also a risk of deadlocks if multiple locks block each other.
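A minimal sketch of pessimistic locking with Java's `ReentrantLock` (the `Counter` class is my own example): without the lock, the two threads below would lose updates to the shared counter.

```java
import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private final ReentrantLock lock = new ReentrantLock(); // pessimistic lock
    private int count = 0;

    public void increment() {
        lock.lock();       // only one thread may enter the critical section
        try {
            count++;       // shared state is mutated safely
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    public int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable task = () -> { for (int i = 0; i < 10_000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get()); // 20000 with the lock; no lost updates
    }
}
```

The `try/finally` pattern matters: if the critical section throws and the lock is never released, every other thread waiting on it deadlocks.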
Optimistic Locking: This mechanism handles concurrency without using traditional locks. It relies on an operation called compare-and-swap (CAS). When a thread reads the shared resource, it keeps track of (variableName, currentValue (the value it read), newValue (the value to write)). Before updating, it checks whether currentValue is unchanged. If it matches, the update succeeds; otherwise, the thread retries or fails. You might wonder: what if two threads read the same resource at the same time and store the same current value? While updating, wouldn't they both see a matching current value?
This does not happen because the compare-and-swap operation is atomic at the hardware instruction level. As a result, only one thread can successfully update the resource at any given time; the other thread's CAS fails and it must retry with the new value.
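In Java, this optimistic pattern is exposed through the atomic classes. A small sketch using `AtomicInteger.compareAndSet` (the class name `CasDemo` and the values are my own):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(5);

        // compareAndSet(expected, new) succeeds only if the current value
        // still equals the expected value that was read earlier.
        boolean firstTry = value.compareAndSet(5, 6); // expected 5 -> succeeds
        boolean staleTry = value.compareAndSet(5, 7); // value is now 6 -> fails

        System.out.println(firstTry + " " + staleTry + " " + value.get()); // true false 6

        // The usual optimistic pattern: read, compute, retry until CAS succeeds.
        int current;
        do {
            current = value.get();
        } while (!value.compareAndSet(current, current + 1));
        System.out.println(value.get()); // 7
    }
}
```

The retry loop is what makes the approach "optimistic": the thread assumes no conflict, and only does extra work when its assumption turns out to be wrong.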
Condition Objects:
Condition objects help manage threads that acquire a lock but cannot proceed until a certain condition is fulfilled (e.g., an account has sufficient funds for a transfer). Without a condition object, a thread holding a lock can prevent other threads from performing the operations needed to satisfy the condition (e.g., another thread might deposit the money).
This can lead to a deadlock scenario: the thread will release the lock only when the condition is met, but other threads cannot enter the code block without the lock, so everyone keeps waiting.
With a condition object, the thread calls await() on the condition to wait until another thread signals that the condition may be met (e.g., by calling signalAll() after transferring funds).
If the condition (e.g., insufficient balance) is not met, the thread will wait by calling the await() method on a condition object. When a thread calls await(), it releases the lock it holds and enters the waiting state. This allows other threads to acquire the lock and continue their work, such as adding funds to the account.
This approach is safer, clearer, and more flexible than the traditional wait()/notify() methods.
What if your service has multiple instances distributed across different machines, and you want to coordinate access between those distributed instances? You need a distributed locking mechanism. Systems like ZooKeeper, Redis, or Consul can provide distributed locks, ensuring that only one instance can access a resource at a time, even when the instances are running on different servers.
I hope you found this article useful. Stay tuned for more on system design topics!