Introduction to Compiler Design
Chapter 1 • Beginner
Introduction to Compiler Design
Compiler translates source code to target code. Understanding compiler phases is essential for GATE CS.
What is a Compiler?
Compiler is a program that translates source code (high-level language) to target code (machine code or intermediate code).
Phases:
- Lexical Analysis
- Syntax Analysis
- Semantic Analysis
- Intermediate Code Generation
- Code Optimization
- Code Generation
Compiler Phases
1. Lexical Analysis (Scanning)
Purpose: Convert source code to tokens.
Input: Source code (character stream)
Output: Tokens (keywords, identifiers, operators, etc.)
Example:
- Input: "int x = 5;"
- Tokens: KEYWORD(int), ID(x), OPERATOR(=), NUMBER(5), SEMICOLON
2. Syntax Analysis (Parsing)
Purpose: Build parse tree from tokens.
Input: Tokens
Output: Parse tree or syntax tree
Example:
- Tokens → Parse tree showing expression structure
3. Semantic Analysis
Purpose: Check semantic correctness.
Tasks:
- Type checking
- Scope resolution
- Symbol table management
4. Intermediate Code Generation
Purpose: Generate intermediate representation.
Forms:
- Three-address code
- Quadruples
- Triples
5. Code Optimization
Purpose: Improve code efficiency.
Optimizations:
- Constant folding
- Dead code elimination
- Loop optimization
6. Code Generation
Purpose: Generate target code.
Output: Machine code or assembly
Lexical Analysis
Lexical Analyzer (Scanner): Converts source to tokens.
Token: Smallest meaningful unit.
Token Types:
- Keywords: if, while, int
- Identifiers: variable names
- Operators: +, -, *, /
- Literals: numbers, strings
- Punctuation: ;, {, }
Regular Expressions: Used to specify token patterns.
Finite Automata: Used to recognize tokens.
Symbol Table
Symbol Table: Stores information about identifiers.
Information:
- Name
- Type
- Scope
- Address
Operations:
- Insert
- Lookup
- Delete
GATE CS Weightage
Compiler Design typically accounts for:
- 6-10 marks out of 100 in GATE CS
- Lowest priority subject
- Requires understanding of parsing and code generation
Important Topics for GATE CS
- Lexical Analysis (Medium Priority)
- Token recognition
- Regular expressions
- Parsing (High Priority)
- Top-down parsing
- Bottom-up parsing
- Syntax-Directed Translation (Medium Priority)
- SDT schemes
- Attribute grammars
- Code Generation (Low Priority)
- Three-address code
- Code optimization