Sunday, 7 July 2013

compiler

A compiler is a computer program (or set of programs) that transforms source code written in a high-level programming language (the source language) into another computer language (the target language), usually machine language in a binary form known as object code. The most common reason for transforming source code is to create an executable program.

The compiler scans the entire program first and then translates it into machine code, which the computer's processor executes to perform the corresponding tasks.
Compiler working
A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (syntax-directed translation), code generation, and code optimization.


Any large piece of software is easier to understand and implement if it is divided into well-defined modules.


[Figure: structure of a compiler]


1.  Lexical analysis:- It is the process of breaking down the source files into keywords, constants, identifiers, operators and other simple tokens.  A token is the smallest unit of text that has meaning in the language.
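As a rough illustration of lexical analysis, here is a minimal tokenizer sketch. The names TokenKind, Token and tokenize are made up for this example, and only a toy subset of C-like text is handled (two keywords, identifiers, numbers, simple strings and single-character operators); a real lexer covers far more.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories for this sketch; a real language defines many more.
enum class TokenKind { Keyword, Identifier, Number, String, Operator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split a source string into tokens. Only a toy subset is handled:
// the keywords "float" and "int", identifiers, numeric literals,
// double-quoted strings without escapes, and single-character operators.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Keyword or identifier: letters, digits and underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            std::string word = src.substr(start, i - start);
            bool isKeyword = (word == "float" || word == "int");
            tokens.push_back({isKeyword ? TokenKind::Keyword : TokenKind::Identifier, word});
        } else if (std::isdigit(c)) {
            // Numeric literal: digits with optional decimal points (not validated here).
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (c == '"') {
            ++i;  // skip the opening quote
            while (i < src.size() && src[i] != '"') ++i;
            if (i < src.size()) ++i;  // skip the closing quote
            tokens.push_back({TokenKind::String, src.substr(start, i - start)});
        } else {
            // Anything else is treated as a one-character operator or punctuation mark.
            ++i;
            tokens.push_back({TokenKind::Operator, std::string(1, src[start])});
        }
    }
    return tokens;
}
```

Calling tokenize on the text float y = 5 + 3.0; yields seven tokens: a keyword, an identifier, and then alternating operators and numbers ending with the semicolon.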
  

2. Syntactical analysis:- It is the process of combining the tokens into
well-formed expressions, statements, and programs.  Each language has
specific rules about the structure of a program, called the grammar or
syntax.  Just like English grammar, it specifies how things may be put
together.  In English, a simple sentence is: subject, verb, object.
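To make the idea of a grammar concrete, here is a small recursive-descent parser sketch. The three-rule arithmetic grammar in the comment is an assumption made up for this example, and for brevity the parser reads characters directly instead of tokens from a lexer. It returns a fully parenthesized string so the structure it recognized is visible, and throws on input that does not fit the grammar.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Toy grammar assumed for this sketch (not taken from any particular language):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class Parser {
public:
    explicit Parser(const std::string& src) : src_(src), pos_(0) {}

    // Parse the whole input; throws std::runtime_error on a syntax error.
    std::string parse() {
        std::string tree = expr();
        skipSpaces();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return tree;
    }

private:
    std::string expr() {
        std::string left = term();
        while (peek() == '+' || peek() == '-') {
            char op = src_[pos_++];
            std::string right = term();
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    std::string term() {
        std::string left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src_[pos_++];
            std::string right = factor();
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    std::string factor() {
        char c = peek();
        if (c == '(') {
            ++pos_;  // consume '('
            std::string inner = expr();
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos_;  // consume ')'
            return inner;
        }
        if (std::isdigit(static_cast<unsigned char>(c))) {
            std::size_t start = pos_;
            while (pos_ < src_.size() &&
                   std::isdigit(static_cast<unsigned char>(src_[pos_]))) ++pos_;
            return src_.substr(start, pos_ - start);
        }
        throw std::runtime_error("expected a number or '('");
    }

    // Skip whitespace and return the next character, or '\0' at end of input.
    char peek() {
        skipSpaces();
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }

    void skipSpaces() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    std::string src_;
    std::size_t pos_;
};
```

With this grammar, 1 + 2 * 3 is grouped as (1 + (2 * 3)) because the multiplication rule sits one level deeper than the addition rule, while an input such as 1 + * 2 is rejected as malformed.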
  

3. Semantic analysis:- It is the process of examining the types and values of the
statements used to make sure they make sense.  During semantic
analysis, the types, values, and other required information about statements are recorded, checked, and transformed as appropriate.

For C/C++, in the line:

float x = "This is red";

Semantic analysis would reveal that the types do not match and cannot be made to match, so the statement would be rejected and an error reported.

While in the statement:

float y = 5 + 3.0;

Semantic analysis would reveal that 5 is an integer and 3.0 is a
double, and that the rules of the language allow 5 to be converted to
a double.  The 5 would therefore be converted to the double 5.0 and the
addition performed.  The compiler would then recognize y as a float,
perform another conversion from the double 8.0 to a float, and process
the assignment.
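The two checks described above can be sketched roughly as follows. The Type enum and the functions additionType and canInitialize are hypothetical names for this example, and the rules are a heavily simplified stand-in for the real C/C++ conversion rules.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical type tags for this sketch, ordered from narrowest to widest
// arithmetic type, with String kept separate.
enum class Type { Int, Float, Double, String };

// Result type of `a + b` for arithmetic operands: the wider type wins
// (Int < Float < Double). Strings are simply rejected in this sketch.
Type additionType(Type a, Type b) {
    if (a == Type::String || b == Type::String)
        throw std::runtime_error("cannot add a string in this sketch");
    return static_cast<int>(a) > static_cast<int>(b) ? a : b;
}

// Can a value of type `from` initialize a variable of type `to`?
// Arithmetic types convert among themselves; a string only fits a string.
bool canInitialize(Type to, Type from) {
    bool toIsNumber = (to != Type::String);
    bool fromIsNumber = (from != Type::String);
    return toIsNumber == fromIsNumber;
}
```

Under these assumed rules, 5 + 3.0 has type double and may initialize a float (with a conversion), while a string literal may not initialize a float, which matches the two statements discussed above.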

4. Intermediate code generation:- Depending on the compiler, this step may be skipped, and the program
may instead be translated directly into the target language (usually machine object code).  If this step is implemented, the compiler designers also design a machine-independent language of their own that is close to machine language and easily translated into machine language for any number of different computers.

The purpose of this step is to allow the compiler writers to support
different target computers and different languages with a minimum of effort.
The part of the compiler which deals with processing the source files,
analyzing the language and generating the intermediate code is called the
front end, while the part which optimizes the intermediate code and
converts it into the target language is called the back end.
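One common style of intermediate language is three-address code, where each instruction applies one operator to at most two operands. The sketch below lowers a small expression tree into such instructions. The Expr structure, the helper names leaf, node and lower, and the t1, t2 naming of temporaries are all assumptions for illustration, not the format of any particular compiler.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A tiny expression tree: a leaf holds a name or literal,
// an inner node holds an operator and two children.
struct Expr {
    std::string value;            // operand text for a leaf, operator for an inner node
    std::unique_ptr<Expr> left;   // null for a leaf
    std::unique_ptr<Expr> right;  // null for a leaf
};

std::unique_ptr<Expr> leaf(const std::string& v) {
    return std::unique_ptr<Expr>(new Expr{v, nullptr, nullptr});
}

std::unique_ptr<Expr> node(const std::string& op,
                           std::unique_ptr<Expr> l,
                           std::unique_ptr<Expr> r) {
    return std::unique_ptr<Expr>(new Expr{op, std::move(l), std::move(r)});
}

// Emit three-address instructions for `e` into `out` and return the name
// that holds its value: the leaf itself, or a fresh temporary t1, t2, ...
std::string lower(const Expr& e, std::vector<std::string>& out, int& nextTemp) {
    if (!e.left) return e.value;  // leaf: nothing to emit
    std::string a = lower(*e.left, out, nextTemp);
    std::string b = lower(*e.right, out, nextTemp);
    std::string t = "t" + std::to_string(++nextTemp);
    out.push_back(t + " = " + a + " " + e.value + " " + b);
    return t;
}
```

For the tree of a + b * c, this produces the two instructions t1 = b * c and t2 = a + t1. A back end for any target machine then only needs to translate these simple instructions, not the original source language.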


5. Code optimization:- In this process the generated code is analyzed and improved for efficiency.  The compiler analyzes the intermediate code to see if improvements can be made that couldn't be made earlier.  For example, some languages, like standard Pascal, do not allow pointer arithmetic, while all machine languages do.  When stepping through arrays, it is often more efficient to use pointers, so the code optimizer may detect this case and internally use pointers.
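The array case can be pictured with two hand-written C++ functions. The first is what a programmer might write with an index; the second shows roughly the shape an optimizer might give the loop internally (a transformation often called strength reduction). The function names are made up for this sketch, and actual compiler output depends on the compiler and target.

```cpp
#include <cstddef>

// Indexed form: conceptually, each access computes the address
// data + i * sizeof(int) before loading the element.
int sumIndexed(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Pointer form: a single pointer is advanced by one element per iteration,
// so no multiplication is needed to find the next element.
int sumPointer(const int* data, std::size_t n) {
    int total = 0;
    for (const int* p = data, *end = data + n; p != end; ++p) total += *p;
    return total;
}
```

Both functions return the same result for the same input; only the way each element's address is obtained differs.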


6. Code generation:- Finally, after the intermediate code has been generated and optimized, the compiler generates code for the specific target language.  Almost always this is machine code for a particular target machine.

Also, it is usually not the final machine code, but is instead object code,
which contains all the instructions but in which not all of the final
memory addresses have been determined.

A subsequent program, called a linker, is used to combine several different
object code files into the final executable program.
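The idea of object code with undetermined addresses can be sketched with a toy model. Everything here (the ObjectFile and Relocation structures, the linkObjects function, and the use of plain integers as instruction words) is an assumption made for illustration and does not reflect any real object file format.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// A place in the code where the address of a named symbol must be filled in.
struct Relocation {
    std::size_t offset;  // index of the word to patch
    std::string symbol;  // name whose final address goes there
};

// Hypothetical, heavily simplified object file.
struct ObjectFile {
    std::vector<int> code;                       // instruction words, with placeholders
    std::map<std::string, std::size_t> defined;  // symbol -> offset within this file
    std::vector<Relocation> relocations;         // placeholders still to be resolved
};

// Concatenate the object files, give every symbol a final address,
// then patch each placeholder word with the address of its symbol.
std::vector<int> linkObjects(const std::vector<ObjectFile>& objects) {
    std::vector<int> image;
    std::map<std::string, std::size_t> addresses;
    std::vector<Relocation> pending;
    for (const ObjectFile& obj : objects) {
        std::size_t base = image.size();  // where this file's code starts
        for (const auto& def : obj.defined) addresses[def.first] = base + def.second;
        for (const Relocation& r : obj.relocations)
            pending.push_back({base + r.offset, r.symbol});
        image.insert(image.end(), obj.code.begin(), obj.code.end());
    }
    for (const Relocation& r : pending) {
        auto it = addresses.find(r.symbol);
        if (it == addresses.end())
            throw std::runtime_error("undefined symbol: " + r.symbol);
        image[r.offset] = static_cast<int>(it->second);
    }
    return image;
}
```

If the first file refers to a symbol defined at the start of a second file, the placeholder is patched with the position where the second file's code lands in the combined image. A reference to a symbol that no file defines is reported as an error.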
