Tuesday, May 28, 2013

Computer Organization and Architecture

Computer Organization
The question ‘How does a computer work?’ is the domain of computer organization. Computer organization encompasses all the physical aspects of computer systems: the way in which the various circuits and structural components come together to make up a fully functional computer system is how the system is organized.

                                                                       
fig-1.1


Computer Architecture
Computer architecture is concerned with the structure and behavior of the computer system and refers to all the logical aspects of system implementation as seen through the eyes of a programmer. For example, in multithreaded programming, the implementation of mutexes and semaphores to restrict access to critical regions and to prevent race conditions among processes is an architectural concern.

fig-1.2

Therefore,
What does a computer do? = Architectural concern.

How does a computer work? = Organizational concern.

Taking automobiles as an example, building a car is, at a general level, a two-step process: 1) creating a logical design of the car, and 2) physically implementing that design. The designer has to decide on everything, from how to design the car for maximum environmental friendliness to what materials will make it cost-effective and good-looking at the same time. All of this falls under the category of architecture. When that design is actually implemented in a car manufacturing plant and a real car is built, we say that organization has taken place.


What are the benefits of studying computer architecture and organization?
Before delving into the technicalities, common sense tells us that, as a user, it is perfectly all right to operate a computer without understanding what it does internally to make ‘magical’ things happen on the screen, or how it does it.

But if a computer science student (lacking knowledge of computer architecture and organization) were to write code that does not suit the internal architecture and organization of his/her computer, the computer would behave strangely, and in the end the student would have to take it to some service center and blindly rely on those guys to fix the problem. If that is the case, then the only difference between a user and a computer science student is that the latter knows how to display the words ‘Hello World’ on the screen in a couple of different programming languages.

Say, for example, a game developer sets the ‘frames per second’ property well above 30 fps. The game would render more smoothly, but CPU usage would rise dramatically, making it very difficult for the CPU to do much else while the game is running and slowing the computer down noticeably. This is because the frames-per-second value specifies how many times the screen gets updated every second: too large a value means too many updates per second, so more and more of the CPU's attention has to be devoted to running the game. If the value is too low, CPU usage falls remarkably, but the game becomes visibly choppy. A value of about 30, therefore, keeps the game acceptably smooth without using up too much of the CPU. Without this knowledge, a game developer would unknowingly build inefficient games that struggle to make any commercial impact.
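To make the trade-off concrete, here is a small Python sketch of a frame-rate-capped game loop (`update_and_render` is a made-up stand-in for the actual game work): the cap determines the per-frame time budget, and whatever is left of the budget is returned to the rest of the system by sleeping.

```python
import time

def frame_budget_ms(fps):
    # At a given frame rate, each frame gets this many milliseconds.
    return 1000.0 / fps

def run_capped(frames, fps, update_and_render):
    # A frame-rate-capped loop: do the frame's work, then sleep off
    # whatever remains of the frame's time budget, freeing the CPU
    # for other tasks.
    budget = frame_budget_ms(fps) / 1000.0
    for _ in range(frames):
        start = time.perf_counter()
        update_and_render()
        elapsed = time.perf_counter() - start
        if elapsed < budget:
            time.sleep(budget - elapsed)

# At 30 fps each frame has ~33.3 ms to work with; at 60 fps only ~16.7 ms,
# which is why a higher cap leaves far less CPU time for everything else.
```

The higher the cap, the smaller the budget per frame and the less time `time.sleep` hands back to the operating system.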


What are the factors that prevent us from speeding up?

1) The fetch-execute cycle: The CPU fetches an instruction from memory and then executes it, one instruction after another. This whole process can be slow if the computer does not implement a technique such as ‘pipelining’, because the fetch-execute cycle generally consists of three stages:

a)      Fetch
b)      Decode
c)      Execute

Without pipelining, if the CPU is already dealing with a particular instruction, it will only consider another once it has completely finished executing the current one; there is no overlap at all. That slows the computer down considerably, as each instruction has to wait for the previous one to finish.
If, on the other hand, pipelining is implemented, the CPU can fetch a new instruction while decoding a second and executing a third. By the time it finishes executing the first instruction, the next one is already decoded and ready to execute immediately. In this way all three stages stay busy, and in the ideal case the CPU completes one instruction every clock cycle instead of one every three.
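As a back-of-the-envelope illustration, here is a Python sketch of the cycle counts, under the idealized assumption that every stage takes exactly one clock cycle and nothing ever stalls:

```python
def sequential_cycles(n, stages=3):
    # Without pipelining: each instruction passes through all three
    # stages (fetch, decode, execute) before the next one may start.
    return n * stages

def pipelined_cycles(n, stages=3):
    # With pipelining: the first instruction takes `stages` cycles to
    # fill the pipeline, then one instruction completes every cycle.
    return 0 if n == 0 else stages + (n - 1)

print(sequential_cycles(10))  # 30 cycles
print(pipelined_cycles(10))   # 12 cycles
```

For long instruction streams the pipelined count approaches one cycle per instruction, a roughly three-fold improvement with a three-stage pipeline.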

  2) Hardware limitations: A simple CPU traditionally has a single ALU (Arithmetic Logic Unit), for example, which restricts it to completing only one instruction per clock cycle. With the rapid advancement of technology and the resultant rise in computing demands, this is clearly not fast enough. One possible solution is therefore to take the architecture to the superscalar level.

A superscalar architecture is one with two or more functional units, allowing it to carry out two or more instructions per clock cycle. A CPU could be built with two ALUs, for example.

fig-1.3

The figure above shows the micro-architecture of such a processor. It consists of two ALUs, an on-board FPU (Floating Point Unit) with its own set of floating-point registers, and a BPU (Branch Prediction Unit) for handling branch instructions. The cache memory on top allows the processor to fetch instructions much faster. Clearly, such a processor is built for speed: with its four execution units it can work on four instructions at once.
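A toy model of dual issue makes the benefit, and its limit, visible. This Python sketch (not how any real dispatcher is implemented) issues up to two instructions per cycle, in order, except that an instruction cannot issue in the same cycle as one whose result it needs:

```python
def issue_cycles(program, width=2):
    # program: list of (dest_register, [source_registers]) pairs.
    cycles, i = 0, 0
    while i < len(program):
        cycles += 1
        written_this_cycle = set()
        slots = 0
        while i < len(program) and slots < width:
            dest, sources = program[i]
            if written_this_cycle & set(sources):
                break  # needs a result produced this very cycle: wait
            written_this_cycle.add(dest)
            slots += 1
            i += 1
    return cycles

# Four independent instructions: two per cycle on a dual-issue core.
independent = [("r1", []), ("r2", []), ("r3", []), ("r4", [])]
print(issue_cycles(independent))  # 2

# A dependency chain defeats the extra ALU: one instruction per cycle.
chain = [("r1", []), ("r2", ["r1"]), ("r3", ["r2"]), ("r4", ["r3"])]
print(issue_cycles(chain))        # 4
```

The second example shows why superscalar hardware only pays off when the program contains independent instructions for it to issue together.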

  3) Parallelism: A typical computer with a single CPU can be tweaked to perform faster, but that performance will always be limited, because after all, how much can you really get out of a single processor? But imagine having more than one processor in a particular computer system.

Such systems do exist in the form of multiprocessors and parallel computers. They usually have between two and a few thousand interconnected processors, each having its own private memory or sharing a common memory. The obvious advantage of such a system is that any task it undertakes can be divided among the processors by allocating a part of the task to each one.

For example, if a calculation takes ten hours to complete on a conventional computer with one CPU, it would take far less time on a multiprocessor or parallel computer. If the calculation can be split into ten chunks, each requiring one hour, then on a multiprocessor system with ten processors the entire calculation completes in just one hour, because the chunks run in parallel.
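The chunking idea can be sketched in Python with the standard multiprocessing module (summing a range of numbers here is just a stand-in for the ten-hour calculation):

```python
from multiprocessing import Pool

def partial_sum(bounds):
    # One 'chunk' of the overall calculation.
    lo, hi = bounds
    return sum(range(lo, hi))

def parallel_sum(n, workers=10):
    # Split [0, n) into one chunk per worker; the last chunk
    # absorbs any remainder.
    step = n // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers - 1)]
    chunks.append(((workers - 1) * step, n))
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum(1_000_000))  # same answer as sum(range(1_000_000))
```

Each worker process computes its chunk independently, so on a machine with enough cores the wall-clock time approaches that of a single chunk.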
  
  4) Clock Speed and The Von Neumann Bottleneck: The clock speed of a computer tells us the rate at which the CPU operates. Some of the first microcomputers had clock speeds in the range of MHz (Megahertz). Nowadays we have speeds in the range of GHz (Gigahertz).

One possible route to greater speed would be to increase the clock speed. However, it turns out that increasing the clock speed alone does not guarantee noticeable gains in speed and performance, because the speed of the processor is ultimately limited by the rate at which it can retrieve data and instructions from memory.

Suppose a particular task takes ten units of time to complete: eight units are spent waiting on memory and the remaining two on processing. Doubling the clock speed without improving the memory access time reduces the processing time from two units to one, while the eight units of memory time stay the same. The overall time therefore drops only from ten units to nine, a gain of just ten percent.
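The arithmetic above is a special case of Amdahl's law; a quick Python check using the same eight-units-memory, two-units-processing split:

```python
def total_time(memory_time, processing_time, clock_factor):
    # Speeding up the clock only shrinks the processing part;
    # time spent waiting on memory is unaffected.
    return memory_time + processing_time / clock_factor

before = total_time(8, 2, 1)   # 10 units
after = total_time(8, 2, 2)    # 9 units: memory still costs 8
gain = (before - after) / before
print(gain)  # 0.1, i.e. only a ten percent improvement
```

Pushing `clock_factor` toward infinity still leaves the eight units of memory time, so no amount of clock speed alone can beat a 20% improvement here.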

This limitation, caused by the mismatch in speed between the CPU and memory, is known as the Von Neumann Bottleneck. One simple way to relieve it is to install a cache memory between the CPU and main memory, which reduces the effective memory access time. Cache memory has a much faster access time than main memory, but it comes with lower storage capacity and is more expensive.


  5) Branch prediction: The processor acts like a psychic, predicting which branch or group of instructions it will have to deal with next by looking at the instruction code fetched from memory. If the processor guesses correctly most of the time, it can fetch the right instructions beforehand and buffer them, keeping itself busy most of the time. There are various algorithms for implementing branch prediction, some very complex, often predicting several branches ahead. All of this is aimed at keeping the CPU supplied with work and thereby optimizing speed and performance.
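One of the classic simple schemes is the two-bit saturating counter: a branch has to be mispredicted twice in a row before the prediction flips. A Python sketch of a single counter (real predictors keep a table of such counters indexed by branch address, which is omitted here):

```python
def predict_sequence(outcomes):
    # States 0-1 mean 'predict not taken'; states 2-3 mean 'predict taken'.
    state = 2          # start out weakly predicting 'taken'
    correct = 0
    for taken in outcomes:
        prediction = state >= 2
        if prediction == taken:
            correct += 1
        # Nudge the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct

# A loop branch: taken nine times, then not taken once at loop exit.
loop = [True] * 9 + [False]
print(predict_sequence(loop))  # 9 correct out of 10
```

The two-bit hysteresis is what makes loop branches cheap: the single misprediction at loop exit does not flip the predictor for the next run of the loop.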

Monday, February 4, 2013

Differences between processes and threads in operating systems.

A process is a program in execution, whereas a thread is a path of execution within a process. Processes are generally used to execute large, ‘heavyweight’ jobs such as running different applications, while threads carry out much smaller, ‘lightweight’ jobs such as auto-saving a document, downloading files, etc. Whenever we double-click an executable file such as Paint, for instance, the operating system starts a process, and that process starts its primary thread.

Fig 1.1- A thread within a process

Each process runs in its own separate address space. Threads, on the other hand, share the address space of the process they belong to, and this sharing facilitates communication between them. As a result, switching between threads is much simpler and faster than switching between processes.
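The shared address space is easy to demonstrate in Python: several threads of one process update the very same object directly, with a lock guarding the critical region.

```python
import threading

counter = {"value": 0}          # one object, visible to every thread
lock = threading.Lock()

def work(increments):
    for _ in range(increments):
        with lock:              # the critical region
            counter["value"] += 1

threads = [threading.Thread(target=work, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])  # 4000: all four threads updated the same memory
```

Separate processes could not share `counter` like this; they would need an explicit inter-process communication mechanism, which is exactly why thread communication is cheaper.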

Threads also have a great degree of control over the other threads of their process, whereas a process only has control over its child processes.

Also, changes made to the main thread may affect the behavior of the other threads of the same process, whereas changes made to a parent process do not affect its child processes.

Considering the similarities, a thread can do almost anything a process can do. Both are able to reproduce (processes create child processes, and threads create more threads), and both have an ID, a base priority, an execution time and an exit status as attributes.


Fig 1.2- Life cycle of a process

To understand the actual differences between processes and threads, let us consider a real-life example. Suppose you are driving from city A to city B and you have had only three hours of sleep the night before after a terribly hectic day. Imagine yourself as the CPU and driving to city B as a process. While driving you may have to do a number of smaller tasks such as checking the map on the GPS, making sure that you don’t exceed speed limits, making sure you don’t fall asleep (trust me, it happens!), etc. These little tasks are actually threads of your process because collectively they serve to help in its execution. If any thread fails to execute properly, the entire process may be disrupted. (For instance, imagine what would happen if you actually fell asleep!)
Now let's assume there had been a storm the night before. After travelling some distance you see that a toppled-over tree is blocking the road ahead. You cannot move any further until the tree is removed, so your current process has to be stopped. Now you have a brand-new process on your hands: getting the tree out of the way.

So you gather people from a nearby village (they can be thought of as a thread of your new process) and shove the tree out of the way. Then you go back to your car and continue your journey.

Fig 1.3- Check out my drawing skills!!

Let us take another example. Suppose you are cooking pasta in your kitchen. Once again, imagine that you are the CPU and cooking pasta is your process. Your threads this time are rolling and cutting the pasta dough, preparing the sauce, grating cheese, etc. All of a sudden, from the corner of your eye, you notice something moving. You turn in that direction and, to your utter horror, you see a huge rat! You cannot cook knowing that a huge, nasty rat is at large in your kitchen, so you stop whatever you were doing and set about a new task, or a new process: getting rid of the nasty little rodent. You cut a little slice of cheese, take a stick and wait in ambush for the rat. All of these are your threads. When the rat finally does show up, you swing at it and miss terribly, but you do succeed in scaring it out of your kitchen. Satisfied, you peacefully get back to your cooking.

Considering threads in Windows and Linux
In Linux, there is traditionally no distinction between processes and threads; the kernel simply uses the term task for both. In Windows, by contrast, there are considerable differences between the two, as mentioned above.

The limited distinction between threads and processes in Linux exists because the entire context of a process is not held in the main process data structure; the context lives in independent data structures, and the process data structure simply contains pointers to them. In Windows, on the other hand, all the data vital to the process are contained within the process object itself.

In Linux, the scheduler represents a higher priority of execution by a lower number, whereas in Windows it is represented by a higher number.








References
-Modern Operating Systems by Andrew S. Tanenbaum and Albert S. Woodhull
-Operating System Principles by Abraham Silberschatz, Peter Baer Galvin and Greg Gagne
-http://www.programmerinterview.com/index.php/operating-systems/thread-vs-process
-www.wikipedia.com
-http://rossano.pro.br/fatec/cursos/soi/Linux.pdf
-http://avellano.fis.usal.es/~lalonso/amp_inf/windows.pdf
-http://codesam.blogspot.com/2011/03/introduction-to-thread-in-java-part-2.html