BRA Branch; Motorola 680x0, Motorola 68300; short (16 bit) unconditional branch relative to the current program counter
JMP Jump; Motorola 680x0, Motorola 68300; unconditional jump (any valid effective addressing mode other than data register)
JMP Jump; Intel 80x86; unconditional jump (near [relative displacement from PC] or far; direct or indirect [based on contents of general purpose register, memory location, or indexed])
JMP Jump; MIX; unconditional jump to location M; J-register loaded with the address of the instruction which would have been next if the jump had not been taken
JSJ Jump, Save J-register; MIX; unconditional jump to location M; J-register unchanged
Jcc Jump Conditionally; Intel 80x86; conditional jump (near [relative displacement from PC] or far; direct or indirect [based on contents of general purpose register, memory location, or indexed]) based on a tested condition: JA/JNBE, JAE/JNB, JB/JNAE, JBE/JNA, JC, JE/JZ, JNC, JNE/JNZ, JNP/JPO, JP/JPE, JG/JNLE, JGE/JNL, JL/JNGE, JLE/JNG, JNO, JNS, JO, JS
Bcc Branch Conditionally; Motorola 680x0, Motorola 68300; short (16 bit) conditional branch relative to the current program counter based on a tested condition: BCC, BCS, BEQ, BGE, BGT, BHI, BLE, BLS, BLT, BMI, BNE, BPL, BVC, BVS
JOV Jump on Overflow; MIX; conditional jump to location M if overflow toggle is on; if jump occurs, J-register loaded with the address of the instruction which would have been next if the jump had not been taken
Condition codes are the list of possible conditions that can be tested during conditional instructions. Typical conditional instructions include: conditional branches, conditional jumps, and conditional subroutine calls. Some processors have a few additional data-related conditional instructions, and some processors make every instruction conditional. Not all condition codes available for a processor will be implemented for every conditional instruction.
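As a rough illustration of how a condition code selects a flag test, the following Python sketch models a subset of the Intel 80x86 Jcc conditions listed above. The names JCC_CONDITIONS, ALIASES and jump_taken are hypothetical helpers invented for this example, and the flags are passed in as a plain dictionary rather than read from a real processor.

```python
# Sketch: map some x86 Jcc mnemonics to the status-flag tests they perform.
# Flags (CF, ZF, SF, OF, PF) are supplied as a dict of 0/1 values.
JCC_CONDITIONS = {
    "JE":  lambda f: f["ZF"] == 1,                      # equal / zero
    "JNE": lambda f: f["ZF"] == 0,                      # not equal / not zero
    "JC":  lambda f: f["CF"] == 1,                      # carry
    "JNC": lambda f: f["CF"] == 0,                      # no carry
    "JA":  lambda f: f["CF"] == 0 and f["ZF"] == 0,     # unsigned above
    "JAE": lambda f: f["CF"] == 0,                      # unsigned above or equal
    "JB":  lambda f: f["CF"] == 1,                      # unsigned below
    "JBE": lambda f: f["CF"] == 1 or f["ZF"] == 1,      # unsigned below or equal
    "JG":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],  # signed greater
    "JGE": lambda f: f["SF"] == f["OF"],                # signed greater or equal
    "JL":  lambda f: f["SF"] != f["OF"],                # signed less
    "JLE": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],   # signed less or equal
    "JO":  lambda f: f["OF"] == 1,                      # overflow
    "JNO": lambda f: f["OF"] == 0,                      # no overflow
    "JS":  lambda f: f["SF"] == 1,                      # sign (negative)
    "JNS": lambda f: f["SF"] == 0,                      # no sign
    "JP":  lambda f: f["PF"] == 1,                      # parity even
    "JNP": lambda f: f["PF"] == 0,                      # parity odd
}

# Alternate spellings of the same conditions.
ALIASES = {
    "JZ": "JE", "JNZ": "JNE", "JNBE": "JA", "JNB": "JAE", "JNAE": "JB",
    "JNA": "JBE", "JNLE": "JG", "JNL": "JGE", "JNGE": "JL", "JNG": "JLE",
    "JPE": "JP", "JPO": "JNP",
}


def jump_taken(mnemonic, flags):
    """Return True if the conditional jump would be taken for these flags."""
    name = mnemonic.upper()
    name = ALIASES.get(name, name)
    return JCC_CONDITIONS[name](flags)


# Flags as they would be left by comparing two equal values.
flags_equal = {"CF": 0, "ZF": 1, "SF": 0, "OF": 0, "PF": 1}
print(jump_taken("JE", flags_equal), jump_taken("JA", flags_equal))
```

Keeping the aliases in a separate table mirrors the paired spellings in the list above (JA/JNBE, JE/JZ, and so on): each pair names one test, not two.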
Data movement instructions move data from one location to another. The source and destination locations are determined by the addressing modes, and can be registers or memory. Some processors have different instructions for loading registers and storing to memory, while other processors have a single instruction with flexible addressing modes.
Processors can broadly be divided into the categories of: CISC, RISC, hybrid, and special purpose.
Attributes are declarative tags in code that insert additional metadata into an assembly.
Assemblies are of two types:
Assemblies are made up of IL code modules and the metadata that describes them. Although programs may be compiled via an IDE or the command line, in fact, they are simply translated into IL, not machine code. The actual machine code is not generated until the function that requires it is called.
Each personal computer has a microprocessor that manages the computer's arithmetical, logical, and control activities.
Each family of processors has its own set of instructions for handling various operations such as getting input from the keyboard, displaying information on the screen, and performing various other jobs. This set of instructions is called 'machine language instructions'.
A processor understands only machine language instructions, which are strings of 1's and 0's. However, machine language is too obscure and complex for use in software development. So, a low-level assembly language is designed for a specific family of processors; it represents the various instructions in symbolic code and a more understandable form.
Having an understanding of assembly language makes one aware of:
How programs interface with OS, processor, and BIOS;
How data is represented in memory and other external devices;
How the processor accesses and executes instructions;
How instructions access and process data;
How a program accesses external devices.
Other advantages of using assembly language are:
It requires less memory and execution time;
It makes complex hardware-specific jobs easier;
It is suitable for time-critical jobs;
It is most suitable for writing interrupt service routines and other memory resident programs.
The main internal hardware of a PC consists of the processor, memory, and registers. Registers are processor components that hold data and addresses. To execute a program, the system copies it from the external device into internal memory. The processor then executes the program instructions.
The fundamental unit of computer storage is a bit; it can be ON (1) or OFF (0). A group of eight related bits makes a byte. Memory that supports parity checking stores a ninth bit with each byte: eight bits are used for data and the last one is used for parity. According to the rule of odd parity, the number of bits that are ON (1) in each nine-bit group should always be odd.
So, the parity bit is set so that the total number of ON bits is odd. If the count is even, the system assumes that there has been a parity error (though rare), which might have been caused by a hardware fault or electrical disturbance.
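The odd-parity rule described above can be sketched in a few lines of Python. The function names odd_parity_bit and has_parity_error are made up for this illustration; real parity checking is done by the memory hardware, not by software like this.

```python
def count_ones(value):
    """Count the bits that are ON (1) in a non-negative integer."""
    return bin(value).count("1")


def odd_parity_bit(data_byte):
    """Return the parity bit that makes the total number of ON bits odd."""
    if not 0 <= data_byte <= 0xFF:
        raise ValueError("data_byte must fit in eight bits")
    # If the data already has an odd number of 1 bits, the parity bit is 0.
    return 0 if count_ones(data_byte) % 2 == 1 else 1


def has_parity_error(data_byte, parity_bit):
    """Report an error when data bits plus parity bit hold an even number of 1s."""
    return (count_ones(data_byte) + parity_bit) % 2 == 0


stored = 0b10110010                 # four 1 bits, so the parity bit must be 1
parity = odd_parity_bit(stored)
corrupted = stored ^ 0b00000100     # a single flipped bit
print(parity, has_parity_error(stored, parity), has_parity_error(corrupted, parity))
```

A single flipped bit always changes the count from odd to even, which is why the check catches it; two flipped bits would go unnoticed.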
The processor supports the following data sizes −
Word: a 2-byte data item
Doubleword: a 4-byte (32 bit) data item
Quadword: an 8-byte (64 bit) data item
Paragraph: a 16-byte (128 bit) area
Kilobyte: 1024 bytes
Megabyte: 1,048,576 bytes
Every number system uses positional notation, i.e., each position in which a digit is written has a different positional value. Each position is a power of the base, which is 2 for the binary number system, and these powers begin at 0 and increase by 1.
The value of a binary number is based on the presence of 1 bits and their positional values. So, the value of an 8-bit binary number with every bit set (11111111) is:
1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255
which is the same as 2^8 - 1.
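The positional rule can be checked with a short Python sketch. The helper name binary_value is hypothetical; it simply adds the positional value of every 1 bit, counting positions from the right starting at 0.

```python
def binary_value(bits):
    """Sum the positional values (powers of 2) of the 1 bits in a bit string."""
    if not bits or any(ch not in "01" for ch in bits):
        raise ValueError("bits must be a non-empty string of 0s and 1s")
    total = 0
    # The rightmost digit is position 0, the next is position 1, and so on.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(binary_value("11111111"))   # all eight positions contribute
print(binary_value("1010"))       # only positions 1 and 3 contribute
```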
The hexadecimal number system uses base 16. The digits in this system range from 0 to 15. By convention, the letters A through F are used to represent the hexadecimal digits corresponding to decimal values 10 through 15.
Hexadecimal numbers are used in computing to abbreviate lengthy binary representations. Basically, the hexadecimal number system represents binary data by dividing each byte in half and expressing the value of each half-byte as one hexadecimal digit.
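A minimal Python sketch of the half-byte idea follows. The name byte_to_hex is invented for this example; Python's built-in formatting would do the same job, but the explicit version shows the split into two 4-bit halves.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(byte):
    """Split a byte into two half-bytes and write each as one hex digit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    high_half = byte >> 4      # upper four bits
    low_half = byte & 0x0F     # lower four bits
    return HEX_DIGITS[high_half] + HEX_DIGITS[low_half]


print(byte_to_hex(0b10100101))   # halves 1010 and 0101
print(byte_to_hex(0b11111111))   # halves 1111 and 1111
```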
Assembly language is dependent upon the instruction set and the architecture of the processor. In this tutorial, we focus on Intel IA-32 processors such as the Pentium. To follow this tutorial, you will need:
An IBM PC or any equivalent compatible computer
A copy of Linux operating system
A copy of NASM assembler program
There are many good assembler programs, such as :
Microsoft Assembler (MASM)
Borland Turbo Assembler (TASM)
The GNU assembler (GAS)
We will use the NASM assembler, as it is :
Free. You can download it from various web sources.
Well documented, and you will find plenty of information about it on the net.
Usable on both Linux and Windows.
If you select "Development Tools" while installing Linux, you may get NASM installed along with the Linux operating system and you do not need to download and install it separately. For checking whether you already have NASM installed, take the following steps −
Open a Linux terminal.
Type whereis nasm and press ENTER.
If it is already installed, then a line like, nasm: /usr/bin/nasm appears. Otherwise, you will see just nasm:, then you need to install NASM.
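The same check can be scripted. The sketch below uses Python's standard shutil.which, which searches the directories on PATH much as the shell does; the function name find_nasm is hypothetical, and note that whereis also looks in a few fixed system locations that this sketch does not cover.

```python
import shutil


def find_nasm():
    """Return the full path of the nasm executable, or None if it is not on PATH."""
    return shutil.which("nasm")


location = find_nasm()
if location is None:
    print("nasm: not found on PATH, it needs to be installed")
else:
    print("nasm:", location)
```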
To install NASM, take the following steps :
Check the Netwide Assembler (NASM) website for the latest version.
Download the Linux source archive nasm-X.XX.tar.gz, where X.XX is the NASM version number in the archive.
Unpack the archive into a directory, which creates a subdirectory nasm-X.XX.
cd to nasm-X.XX and type ./configure. This shell script will find the best C compiler to use and set up Makefiles accordingly.
Type make to build the nasm and ndisasm binaries.
Type make install to install nasm and ndisasm in /usr/local/bin and to install the man pages.
This should install NASM on your system. Alternatively, you can use an RPM distribution for Fedora Linux. This version is simpler to install: just double-click the RPM file.
An assembly program can be divided into three sections −
The data section,
The bss section, and
The text section.
The data section is used for declaring initialized data or constants. This data does not change at runtime. You can declare various constant values, file names, or buffer size, etc., in this section.
The syntax for declaring the data section is:
section .data
The bss section is used for declaring variables, that is, storage that is reserved without an initial value. The syntax for declaring the bss section is:
section .bss
The text section holds the actual code. It must contain the declaration global _start, which exports the entry-point label so that the linker knows where program execution begins.
The syntax for declaring the text section is:
section .text
   global _start
_start:
Assembly language programs consist of three types of statements −
Executable instructions or instructions,
Assembler directives or pseudo-ops, and
Macros.
The executable instructions or simply instructions tell the processor what to do. Each instruction consists of an operation code (opcode). Each executable instruction generates one machine language instruction.
The assembler directives or pseudo-ops tell the assembler about the various aspects of the assembly process. These are non-executable and do not generate machine language instructions.
Macros are basically a text substitution mechanism.
Assembly language statements are entered one statement per line. Each statement follows the following format −
[label] mnemonic [operands] [;comment]
The fields in the square brackets are optional. A basic instruction has two parts, the first one is the name of the instruction (or the mnemonic), which is to be executed, and the second are the operands or the parameters of the command.
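To make the statement format concrete, here is a small Python sketch that splits one line into its four fields. It is only an illustration under simplifying assumptions: the function name parse_statement is invented, a label is recognized only when it ends with a colon, and a semicolon inside a quoted string would be mistaken for the start of a comment. A real assembler's parser is considerably more involved.

```python
def parse_statement(line):
    """Split '[label] mnemonic [operands] [;comment]' into its fields."""
    # Everything after the first ';' is the comment.
    code, _, comment = line.partition(";")
    fields = {
        "label": None,
        "mnemonic": None,
        "operands": [],
        "comment": comment.strip() or None,
    }
    tokens = code.split(None, 1)          # first word, then the remainder
    # Assumption for this sketch: a label is a first word ending in ':'.
    if tokens and tokens[0].endswith(":"):
        fields["label"] = tokens[0][:-1]
        tokens = tokens[1].split(None, 1) if len(tokens) > 1 else []
    if tokens:
        fields["mnemonic"] = tokens[0]
        if len(tokens) > 1:
            fields["operands"] = [op.strip() for op in tokens[1].split(",")]
    return fields


print(parse_statement("start: mov eax, 4 ; system call number"))
print(parse_statement("ret"))
```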
A segmented memory model divides the system memory into groups of independent segments referenced by pointers located in the segment registers. Each segment is used to contain a specific type of data. One segment is used to contain instruction codes, another segment stores the data elements, and a third segment keeps the program stack.
In the light of the above discussion, we can specify various memory segments as −
Data segment − It is represented by the .data and .bss sections. The .data section is used to declare the memory region where data elements are stored for the program. This section cannot be expanded after the data elements are declared, and it remains static throughout the program.
The .bss section is also a static memory section that contains buffers for data to be declared later in the program. This buffer memory is zero-filled.
Code segment − It is represented by .text section. This defines an area in memory that stores the instruction codes. This is also a fixed area.
Stack − This segment contains data values passed to functions and procedures within the program.
There are ten 32-bit and six 16-bit processor registers in IA-32 architecture. The registers are grouped into three categories −
General registers,
Control registers, and
Segment registers.
The general registers are further divided into the following groups −
Data registers,
Pointer registers, and
Index registers.
You can make use of Linux system calls in your assembly programs. You need to take the following steps for using Linux system calls in your program −
Put the system call number in the EAX register.
Store the arguments to the system call in the registers EBX, ECX, etc.
Call the relevant interrupt (80h).
The result is usually returned in the EAX register.
There are six registers that store the arguments of the system call used. These are the EBX, ECX, EDX, ESI, EDI, and EBP. These registers take the consecutive arguments, starting with the EBX register. If there are more than six arguments, then the memory location of the first argument is stored in the EBX register.
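The register convention described above can be modelled with a short Python sketch. Nothing here performs a real system call: assign_syscall_registers is a hypothetical helper that only shows which value ends up in which register. The example uses call number 4 (sys_write on 32-bit Linux) with a made-up buffer address.

```python
# Registers that receive consecutive system call arguments, in order.
ARG_REGISTERS = ["EBX", "ECX", "EDX", "ESI", "EDI", "EBP"]


def assign_syscall_registers(call_number, args):
    """Return a dict showing the register contents before 'int 80h'."""
    if len(args) > len(ARG_REGISTERS):
        # With more than six arguments, EBX holds the memory location
        # of the first argument instead; this sketch does not model that.
        raise ValueError("more than six arguments must be passed via memory")
    registers = {"EAX": call_number}
    registers.update(zip(ARG_REGISTERS, args))
    return registers


# write(fd=1, buffer, length): 0x0804A000 is an invented buffer address.
print(assign_syscall_registers(4, [1, 0x0804A000, 13]))
```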
Most assembly language instructions require operands to be processed. An operand address provides the location, where the data to be processed is stored. Some instructions do not require an operand, whereas some other instructions may require one, two, or three operands.
When an instruction requires two operands, the first operand is generally the destination, which contains data in a register or memory location and the second operand is the source. Source contains either the data to be delivered (immediate addressing) or the address (in register or memory) of the data. Generally, the source data remains unaltered after the operation.
The three basic modes of addressing are −
Register addressing
Immediate addressing
Memory addressing
Register Addressing
In this addressing mode, a register contains the operand. Depending upon the instruction, the register may be the first operand, the second operand or both.
MOV DX, TAX_RATE ; Register in first operand
MOV COUNT, CX ; Register in second operand
MOV EAX, EBX ; Both the operands are in registers
As processing data between registers does not involve memory, it provides the fastest processing of data.
Immediate Addressing
An immediate operand has a constant value or an expression. When an instruction with two operands uses immediate addressing, the first operand may be a register or memory location, and the second operand is an immediate constant. The first operand defines the length of the data.
BYTE_VALUE DB 150 ; A byte value is defined
WORD_VALUE DW 300 ; A word value is defined
ADD BYTE_VALUE, 65 ; An immediate operand 65 is added
MOV AX, 45H ; Immediate constant 45H is transferred to AX
Direct Memory Addressing
When operands are specified in memory addressing mode, direct access to main memory, usually to the data segment, is required. This way of addressing results in slower processing of data. To locate the exact location of data in memory, we need the segment start address, which is typically found in the DS register, and an offset value. This offset value is also called the effective address.
In direct addressing mode, the offset value is specified directly as part of the instruction, usually indicated by the variable name. The assembler calculates the offset value and maintains a symbol table, which stores the offset values of all the variables used in the program.
In direct memory addressing, one of the operands refers to a memory location and the other operand references a register.
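The symbol table mentioned above can be pictured with a short Python sketch. It is a simplification: build_symbol_table and DIRECTIVE_SIZES are names invented for this example, every declaration is assumed to define exactly one item, and offsets simply start at 0 within the data segment.

```python
# Bytes reserved by each define directive (one item per declaration).
DIRECTIVE_SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8}


def build_symbol_table(declarations):
    """Map each variable name to its offset from the start of the data segment."""
    table = {}
    offset = 0
    for name, directive in declarations:
        table[name] = offset                  # this variable starts here
        offset += DIRECTIVE_SIZES[directive]  # the next one follows it
    return table


# BYTE_VALUE and WORD_VALUE come from the text; COUNT's size is assumed.
symbols = build_symbol_table([
    ("BYTE_VALUE", "DB"),
    ("WORD_VALUE", "DW"),
    ("COUNT", "DW"),
])
print(symbols)
```

An instruction such as MOV COUNT, CX would then carry the offset recorded for COUNT, and the processor would combine it with the segment start address to reach the data.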
The EQU directive is used for defining constants. The syntax of the EQU directive is as follows −
CONSTANT_NAME EQU expression
For example: TOTAL_STUDENTS equ 50
You can then use this constant value in your code, like −
mov ecx, TOTAL_STUDENTS
cmp eax, TOTAL_STUDENTS
The operand of an EQU statement can be an expression −
LENGTH equ 20
WIDTH equ 10
AREA equ LENGTH * WIDTH
The above code segment defines AREA as 200.
You can check by using the "ps" and "grep" commands, e.g. ps -ef | grep "myprocess". The keyword you search for with grep can be anything unique to your process that appears in its command line, e.g. the name of the class that implements the main method. You can also run "ps -ef | grep java" to list all Java processes.
First, you need to find the PID of your process, which you can do with the "ps" command as shown above. Once you have the PID, you can use the "top" command to find the CPU and memory usage. Alternatively, you can use the prstat command.
These are parameters that specify the heap size in Java. -Xms defines the size of the heap when the JVM starts up, and -Xmx specifies the maximum heap size for the Java application, i.e. the heap cannot grow beyond that, and the JVM will throw OutOfMemoryError if the heap does not have enough space to create new objects.
JRE stands for Java Runtime Environment and JVM stands for Java Virtual Machine. You install the JRE to run a Java application, e.g. an applet, a core Java application, or a web server like Tomcat. The JVM is part of the JRE.
JVM stands for Java Virtual Machine, while JIT stands for just-in-time compiler. The JIT is part of the JVM and converts Java bytecode into native machine code, which runs faster. Compilation is triggered by a threshold: code that runs more often than the threshold becomes a candidate for just-in-time compilation.
There are many ways to take a heap dump of a Java process, e.g. Tomcat, but the most common is to use the tools available in the JDK, e.g. jvisualvm, jcmd, and jmap. Here is the command you can use to take a heap dump of a Java process:
$ jmap -dump:live,file=/location/of/heap_dump.hprof PID
The heap dump will contain all live objects, stored in the heap_dump.hprof file. Remember, you also need the PID of the Java process, which you can find by using the "ps" and "grep" commands as discussed earlier. You can see Java Performance Companion by Charlie Hunt to learn more about taking and analyzing heap dumps in Java to find memory leaks and other memory-related errors.
Taking a thread dump is easier than taking a heap dump because you don't need to remember a tool. In Linux, you can just use the kill command to take the thread dump, e.g.
$ kill -3 PID
will print the thread dump to the JVM's standard output, i.e. the console or wherever System.out is redirected. Similarly, in Windows, you can use Ctrl + Break from the command prompt. Alternatively, you can use jConsole or VisualVM to take the thread dump of a Java application on both Windows and Linux. You can also read Java Performance: The Definitive Guide by Scott Oaks to learn more about thread dumps and heap dumps.
The Java virtual machine throws java.lang.OutOfMemoryError when there is not enough memory to run the application, e.g. no more memory to create new objects or no more memory to create new threads. The most common variant is java.lang.OutOfMemoryError: Java heap space, which occurs when there is no memory left in the heap to create a new object.
The main difference between a 32-bit and a 64-bit JVM is that the latter is designed for a 64-bit operating system, e.g. a 64-bit edition of Windows or Linux. From a Java developer's perspective, the main difference between them is the heap size. A 64-bit JVM can address a far larger heap than a 32-bit JVM, whose theoretical limit is 4 GB. If your program needs more memory, run it on a 64-bit JVM with a large heap.
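The 4 GB figure follows from pointer width: with 32 address bits there are 2^32 distinct byte addresses. The Python sketch below only does that arithmetic; addressable_bytes is a hypothetical helper, and the heap a 32-bit JVM can actually use is smaller in practice because the operating system and the JVM itself also occupy part of the address space.

```python
GIB = 2 ** 30   # bytes in one gibibyte
EIB = 2 ** 60   # bytes in one exbibyte


def addressable_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can express."""
    return 2 ** pointer_bits


print(addressable_bytes(32) // GIB, "GiB with 32-bit pointers")
print(addressable_bytes(64) // EIB, "EiB with 64-bit pointers")
```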
5th April | 08:00 AM