Quick x86 Assembly Info
Tagged: assembly, avx, cpu, float-point, fpu, instructions, intel, mmx, mnemonic, processor, programming, register, simd, sse, x86, x87
This topic was published by DevynCJohnson.
Datatype Suffixes
- z - zword; 512-bits
- u - upper-word; yword or zword
- y - yword; 256-bits
- h - half-word; qword, oword, or yword
- o - oword; 128-bits
- f - fourth-word; dword, qword, or oword
- n - normal-word; oword, yword, or zword
- x - xword; oword or yword
- t - ten-bytes; 80-bit floating-point
- q - quadword; 64-bits
- l - longword/doubleword; 32-bit integer or 64-bit floating-point
- w - word; 16-bits
- e - eighth-word; word, dword, or qword
- s - short; 16-bit integer or 32-bit floating-point
- b - byte; 8-bits
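As a small illustration of the list above, the following C sketch maps the suffixes that name exactly one size to a width in bits. The helper name `suffix_bits` is hypothetical, and suffixes whose size depends on context (u, h, f, n, x, e, l, s) are deliberately reported as 0 rather than guessed.

```c
/* Hypothetical helper: width in bits for suffixes with one fixed size.
 * Returns 0 for context-dependent suffixes (u, h, f, n, x, e, l, s)
 * and for characters that are not suffixes at all. */
static unsigned suffix_bits(char suffix)
{
    switch (suffix) {
    case 'z': return 512; /* zword */
    case 'y': return 256; /* yword */
    case 'o': return 128; /* oword */
    case 't': return 80;  /* ten-byte floating-point */
    case 'q': return 64;  /* quadword */
    case 'w': return 16;  /* word */
    case 'b': return 8;   /* byte */
    default:  return 0;   /* context-dependent or unknown */
    }
}
```

For example, `suffix_bits('q')` yields 64, while `suffix_bits('l')` yields 0 because "l" can mean a 32-bit integer or a 64-bit floating-point value.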
Condition Codes
| Code | Bit 3 | Bit 2 | Bit 1 | Bit 0 | Condition |
|---|---|---|---|---|---|
| O | 0 | 0 | 0 | 0 | overflow |
| NO | 0 | 0 | 0 | 1 | no overflow |
| B (NAE, C) | 0 | 0 | 1 | 0 | below (not above or equal, carry) |
| NB (AE, NC) | 0 | 0 | 1 | 1 | not below (above or equal, no carry) |
| E (Z) | 0 | 1 | 0 | 0 | equal (zero) |
| NE (NZ) | 0 | 1 | 0 | 1 | not equal (not zero) |
| NA (BE) | 0 | 1 | 1 | 0 | not above (below or equal) |
| A (NBE) | 0 | 1 | 1 | 1 | above (not below or equal) |
| S | 1 | 0 | 0 | 0 | sign |
| NS | 1 | 0 | 0 | 1 | no sign |
| P (PE) | 1 | 0 | 1 | 0 | parity (parity even) |
| NP (PO) | 1 | 0 | 1 | 1 | no parity (parity odd) |
| L (NGE) | 1 | 1 | 0 | 0 | less than (not greater than or equal) |
| NL (GE) | 1 | 1 | 0 | 1 | not less than (greater than or equal) |
| NG (LE) | 1 | 1 | 1 | 0 | not greater than (less than or equal) |
| G (NLE) | 1 | 1 | 1 | 1 | greater than (not less than or equal) |

Control Registers
Control registers change or control the general behavior of the CPU, co-processor, or other digital device. Behaviors include interrupts, addressing mode, paging, and more.
- CR0
  - 0 (PE) - Protected Mode Enable; If 1, the system is in protected mode, else the system is in real mode
  - 1 (MP) - Monitor co-processor; Controls interaction of WAIT/FWAIT instructions with the TS flag in CR0
  - 2 (EM) - Emulation; If set, no x87 floating-point unit is present; if clear, an x87 FPU is present
  - 3 (TS) - Task Switched; Allows saving the x87 task context upon a task switch only after an x87 instruction is used
  - 4 (ET) - Extension Type; On the 386, specifies whether the external math co-processor is an 80287 or 80387
  - 5 (NE) - Numeric Error; Enables internal x87 floating-point error reporting when set, else enables PC-style x87 error detection
  - 16 (WP) - Write Protect; When set, the CPU cannot write to read-only pages when the privilege level is 0
  - 18 (AM) - Alignment Mask; Alignment checking is enabled if AM is set, the AC flag (in the EFLAGS register) is set, and the privilege level is 3
  - 29 (NW) - Not Write-through; Globally enables/disables write-through caching
  - 30 (CD) - Cache Disable; Globally enables/disables the memory cache
  - 31 (PG) - Paging; If 1, enables paging and uses the CR3 register, else disables paging
- CR1 - Reserved
- CR2 - Page Fault Linear Address (PFLA); When a page fault occurs, the address the program attempted to access is stored in the CR2 register.
- CR3 (PDBR) - Page Directory Base Register (upper 20 bits). Used when virtual addressing is enabled to translate linear addresses into physical addresses by locating the page directory and page tables for the current task
- CR4 - Used in protected mode to control operations such as virtual-8086 support, enabling I/O breakpoints, page size extension, and machine check exceptions.
  - 0 (VME) - Virtual 8086 Mode Extensions; If set, enables support for the virtual interrupt flag (VIF) in virtual-8086 mode
  - 1 (PVI) - Protected-mode Virtual Interrupts; If set, enables support for the virtual interrupt flag (VIF) in protected mode
  - 2 (TSD) - Time Stamp Disable; If set, the RDTSC instruction can only be executed in ring 0, otherwise RDTSC can be used at any privilege level
  - 3 (DE) - Debugging Extensions; If set, enables debug-register-based breaks on I/O space access
  - 4 (PSE) - Page Size Extension; If unset, the page size is 4 KiB, else the page size is increased to 4 MiB (or 2 MiB with PAE set)
  - 5 (PAE) - Physical Address Extension; If set, changes the page table layout to translate 32-bit virtual addresses into extended 36-bit physical addresses
  - 6 (MCE) - Machine Check Exception; If set, enables machine check interrupts to occur
  - 7 (PGE) - Page Global Enabled; If set, address translations (PDE or PTE records) may be shared between address spaces
  - 8 (PCE) - Performance-Monitoring Counter Enable; If set, RDPMC can be executed at any privilege level, else RDPMC can only be used in ring 0
  - 9 (OSFXSR) - Operating system support for FXSAVE and FXRSTOR instructions; If set, enables SSE instructions and fast FPU save and restore
  - 10 (OSXMMEXCPT) - Operating System Support for Unmasked SIMD Floating-Point Exceptions; If set, enables unmasked SSE exceptions
  - 13 (VMXE) - Virtual Machine Extensions Enable
  - 14 (SMXE) - Safer Mode Extensions Enable; Trusted Execution Technology (TXT)
  - 16 (FSGSBASE) - Enables the instructions RDFSBASE, RDGSBASE, WRFSBASE, and WRGSBASE
  - 17 (PCIDE) - PCID Enable; If set, enables process-context identifiers (PCIDs)
  - 18 (OSXSAVE) - XSAVE and Processor Extended States Enable
  - 20 (SMEP) - Supervisor Mode Execution Protection Enable; If set, an attempt by supervisor-mode code to execute code from a user-mode (ring 3) page generates a fault
  - 21 (SMAP) - Supervisor Mode Access Protection Enable; If set, an attempt by supervisor-mode code to access data in a user-mode (ring 3) page generates a fault
  - 22 (PKE) - Protection Key Enable
- EFER (MSR 0xC0000080) - Extended Feature Enable Register; x86-64 only
  - 0 (SCE) - System Call Extensions
  - 8 (LME) - Long Mode Enable
  - 10 (LMA) - Long Mode Active
  - 11 (NXE) - No-Execute Enable
  - 12 (SVME) - Secure Virtual Machine Enable
  - 13 (LMSLE) - Long Mode Segment Limit Enable
  - 14 (FFXSR) - Fast FXSAVE/FXRSTOR
  - 15 (TCE) - Translation Cache Extension
- CR5-CR7 - Reserved
- CR8 (TPR) - Task-Priority Register; Prioritizes external interrupts; x86-64 only
- CR9-CR15 - Reserved
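To make the bit positions above concrete, here is a minimal C sketch that decodes a few CR0 bits from a plain 32-bit value. It assumes the value has already been obtained elsewhere (reading CR0 itself requires ring 0); the macro names, the `cpu_mode` type, and the two helper functions are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* CR0 bit masks, using the bit numbers from the list above. */
#define CR0_PE (UINT32_C(1) << 0)   /* Protected Mode Enable */
#define CR0_WP (UINT32_C(1) << 16)  /* Write Protect */
#define CR0_PG (UINT32_C(1) << 31)  /* Paging */

typedef enum { MODE_REAL, MODE_PROTECTED, MODE_PROTECTED_PAGED } cpu_mode;

/* Classify the operating mode implied by the PE and PG bits. */
static cpu_mode cr0_mode(uint32_t cr0)
{
    if (!(cr0 & CR0_PE))
        return MODE_REAL;
    return (cr0 & CR0_PG) ? MODE_PROTECTED_PAGED : MODE_PROTECTED;
}

/* Nonzero if ring-0 writes to read-only pages are blocked (WP set). */
static int cr0_supervisor_write_protected(uint32_t cr0)
{
    return (cr0 & CR0_WP) != 0;
}
```

A value such as 0x80010011 has PE, ET, WP, and PG set, so `cr0_mode` reports protected mode with paging and `cr0_supervisor_write_protected` returns nonzero. The same masking pattern applies to the CR4 and EFER bits.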
Branching and Conditionals
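The conditional instructions in this section (Jcc, SETcc, CMOVcc) all test the 4-bit condition codes listed earlier. The C sketch below models that test under a simplifying assumption: the flags are held in a hypothetical `flags_t` struct rather than the real EFLAGS register, and `condition_holds` is an illustrative name. It relies on the pattern visible in the Condition Codes table, where bit 0 of the code selects the negated form of each condition.

```c
#include <stdbool.h>

/* Hypothetical container for the five flags the condition codes test. */
typedef struct { bool cf, zf, sf, of, pf; } flags_t;

/* Returns true if condition code cc (0..15) holds for the given flags. */
static bool condition_holds(unsigned cc, flags_t f)
{
    bool r;
    switch ((cc >> 1) & 7u) {          /* bits 3..1 pick the base condition */
    case 0:  r = f.of; break;                    /* O  */
    case 1:  r = f.cf; break;                    /* B  */
    case 2:  r = f.zf; break;                    /* E  */
    case 3:  r = f.cf || f.zf; break;            /* NA (BE) */
    case 4:  r = f.sf; break;                    /* S  */
    case 5:  r = f.pf; break;                    /* P  */
    case 6:  r = f.sf != f.of; break;            /* L  */
    default: r = f.zf || (f.sf != f.of); break;  /* NG (LE) */
    }
    return (cc & 1u) ? !r : r;         /* bit 0 set means the negated form */
}
```

For instance, comparing 3 against 5 (computing 3 - 5) leaves CF=1, ZF=0, SF=1, OF=0, so code 0010 (B) and code 1100 (L) hold while code 0111 (A) does not.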
| Mnemonic | Operand 1 | Operand 2 | Operand 3 | Description | Flags Modified |
|---|---|---|---|---|---|
| BOUND | limit | src | | The array index in the source register is checked against the upper and lower bounds stored in memory. The first word located at "limit" is the lower bound and the word at "limit+2" is the upper bound. Interrupt 5 occurs if the source value is less than the lower bound or greater than the upper bound. | |
| BT | src | dest | | The destination bit indexed by the source value is copied into the Carry Flag. | CF |
| BTC | src | dest | | The destination bit indexed by the source value is copied into the Carry Flag and then complemented (inverted) in the destination. | CF |
| BTR | src | dest | | The destination bit indexed by the source value is copied into the Carry Flag and then cleared in the destination. | CF |
| BTS | src | dest | | The destination bit indexed by the source value is copied into the Carry Flag and then set in the destination. | CF |
| CALL | dest | | | Pushes the Instruction Pointer (and the Code Segment for far calls) onto the stack and loads the Instruction Pointer with the address of the procedure. | |
| CMOVcc | src | dest | | Conditional move; CMOVA, CMOVAE, CMOVB, CMOVBE, CMOVC, CMOVE, CMOVG, CMOVGE, CMOVL, CMOVLE, CMOVNA, CMOVNAE, CMOVNB, CMOVNBE, CMOVNC, CMOVNE, CMOVNG, CMOVNGE, CMOVNL, CMOVNLE, CMOVNO, CMOVNP, CMOVNS, CMOVNZ, CMOVO, CMOVP, CMOVPE, CMOVPO, CMOVS, CMOVZ | |
| CMP | src | dest | | Subtracts the source from the destination and updates the flags but does not save the result. The flags can subsequently be checked for conditions. | AF, CF, OF, PF, SF, ZF |
| CMPS | src | dest | | Subtracts the destination value from the source without saving the result. Updates the flags based on the subtraction, and the index registers (E)SI and (E)DI are incremented or decremented depending on the state of the Direction Flag. CMPSB adjusts the index registers by 1, CMPSW by 2, and CMPSD by 4. | AF, CF, OF, PF, SF, ZF |
| CMPSQ | src | dest | | CoMPare String Quadword | |
| CMPXCHG | src | dest | | Compares the accumulator with "dest". If equal, "dest" is loaded with "src"; otherwise the accumulator is loaded with "dest". | AF, CF, OF, PF, SF, ZF |
| CMPXCHG16B | | | | CoMPare and eXCHanGe 16 Bytes | |
| ENTER | level | locals | | Modifies the stack for entry to a procedure in a high-level language. Operand "locals" specifies the amount of storage to be allocated on the stack. "level" specifies the nesting level of the routine. Paired with the LEAVE instruction, this is a convenient method of entry and exit to procedures. | |
| ESC | src | immediate | | Provides access to the data bus for other resident processors. The CPU treats it as a NOP but places the memory operand on the bus. | |
| HLT | | | | Halts the CPU until the RESET line is activated or an NMI or maskable interrupt is received. The CPU becomes dormant but retains the current CS:IP for later restart. | |
| INT | int_num | | | Initiates a software interrupt by pushing the flags, clearing the Trap and Interrupt Flags, pushing CS followed by IP, and loading CS:IP with the value found in the interrupt vector table. Execution then begins at the location addressed by the new CS:IP. | IF, TF |
| INTO | | | | If the Overflow Flag is set, this instruction generates an INT 4, which transfers control to the handler whose vector is stored at 0000:0010. | IF, TF |
| IRET | | | | Returns control to the point of interruption by popping IP, CS, and then the Flags from the stack, and continues execution at this location. CPU exception interrupts will return to the instruction that caused the exception because the CS:IP placed on the stack during the interrupt is the address of the offending instruction. | AF, CF, DF, IF, OF, PF, SF, TF, ZF |
| IRETQ | | | | 64-bit Return from Interrupt | |
| JA | | | | Jump if Above; CF=0 and ZF=0 | |
| JAE | | | | Jump if Above or Equal; CF=0 | |
| JB | | | | Jump if Below; CF=1 | |
| JBE | | | | Jump if Below or Equal; CF=1 or ZF=1 | |
| JC | | | | Jump if Carry; CF=1 | |
| JCXZ | | | | Jump if CX Zero; CX=0 | |
| JE | | | | Jump if Equal; ZF=1 | |
| JG | | | | Jump if Greater (signed); ZF=0 and SF=OF | |
| JGE | | | | Jump if Greater or Equal (signed); SF=OF | |
| JL | | | | Jump if Less (signed); SF != OF | |
| JLE | | | | Jump if Less or Equal (signed); ZF=1 or SF != OF | |
| JMP | | | | Unconditional Jump | |
| JNA | | | | Jump if Not Above; CF=1 or ZF=1 | |
| JNAE | | | | Jump if Not Above or Equal; CF=1 | |
| JNB | | | | Jump if Not Below; CF=0 | |
| JNBE | | | | Jump if Not Below or Equal; CF=0 and ZF=0 | |
| JNC | | | | Jump if Not Carry; CF=0 | |
| JNE | | | | Jump if Not Equal; ZF=0 | |
| JNG | | | | Jump if Not Greater (signed); ZF=1 or SF != OF | |
| JNGE | | | | Jump if Not Greater or Equal (signed); SF != OF | |
| JNL | | | | Jump if Not Less (signed); SF=OF | |
| JNLE | | | | Jump if Not Less or Equal (signed); ZF=0 and SF=OF | |
| JNO | | | | Jump if Not Overflow (signed); OF=0 | |
| JNP | | | | Jump if No Parity; PF=0 | |
| JNS | | | | Jump if Not Signed (signed); SF=0 | |
| JNZ | | | | Jump if Not Zero; ZF=0 | |
| JO | | | | Jump if Overflow (signed); OF=1 | |
| JP | | | | Jump if Parity; PF=1 | |
| JPE | | | | Jump if Parity Even; PF=1 | |
| JPO | | | | Jump if Parity Odd; PF=0 | |
| JRCXZ | | | | Jump if RCX is zero | |
| JS | | | | Jump if Signed (signed); SF=1 | |
| JZ | | | | Jump if Zero; ZF=1 | |
| LEAVE | | | | Releases the local variables created by the previous ENTER instruction by restoring SP and BP to their condition before the procedure stack frame was initialized. | |
| LOCK | | | | This instruction is a prefix that causes the CPU to assert the bus lock signal during the execution of the next instruction. Used to prevent two processors from updating the same data location at once. The 286 always asserts lock during an XCHG with memory operands. On the 386 and later, LOCK is valid only with read-modify-write instructions that have a memory destination (such as ADD, AND, BTS, CMPXCHG, DEC, INC, XADD, and XCHG); other uses raise an invalid-opcode exception. | |
| LOOP | label | | | Decrements CX by 1 and transfers control to "label" if CX is not zero. The "label" operand must be within -128 to +127 bytes of the instruction following the loop instruction. | |
| LOOPE/LOOPZ | label | | | Decrements CX by 1 (without modifying the flags) and transfers control to "label" if CX != 0 and the Zero Flag is set. The "label" operand must be within -128 to +127 bytes of the instruction following the loop instruction. | |
| LOOPNZ/LOOPNE | label | | | Decrements CX by 1 (without modifying the flags) and transfers control to "label" if CX != 0 and the Zero Flag is clear. The "label" operand must be within -128 to +127 bytes of the instruction following the loop instruction. | |
| MONITOR | EDX | ECX | EAX | Setup Monitor Address; Sets up a linear address range to be monitored by hardware and activates the monitor. | |
| MWAIT | ECX | EAX | | Monitor Wait; Processor hint to stop instruction execution and enter an implementation-dependent optimized state until the occurrence of a class of events. | |
| NOP | | | | Do nothing | |
| PAUSE | | | | Provides a hint to the processor that the surrounding code is a spin-wait loop; improves spin-wait performance and reduces power consumption | |
| REP | | | | Repeats execution of string instructions while CX != 0. CX is decremented after each string operation. | |
| REPE/REPZ | | | | Repeats execution of string instructions while CX != 0 and the Zero Flag is set. CX is decremented and the Zero Flag tested after each string operation. | |
| REPNE/REPNZ | | | | Repeats execution of string instructions while CX != 0 and the Zero Flag is clear. CX is decremented and the Zero Flag tested after each string operation. | |
| RET/RETF/RETN | num_bytes | | | Transfers control from a procedure back to the instruction address saved on the stack. "num_bytes" is an optional number of bytes to release. Far returns pop the IP followed by the CS, while near returns pop only the IP register. | |
| RSM | | | | Resumes from System Management Mode (SMM). Introduced with the i386SL and also present in the i486SL and later. | |
| SCAS | | | | Compares the value at ES:DI (even if an operand is specified) with the accumulator and sets the flags as for a subtraction. DI is incremented/decremented based on the instruction format (or operand size) and the state of the Direction Flag. | AF, CF, OF, PF, SF, ZF |
| SETAE/SETNB | dest | | | Sets the byte in the operand to 1 if the Carry Flag is clear, otherwise sets the operand to 0. | |
| SETB/SETNAE | dest | | | Sets the byte in the operand to 1 if the Carry Flag is set, otherwise sets the operand to 0. | |
| SETBE/SETNA | dest | | | Sets the byte in the operand to 1 if the Carry Flag or the Zero Flag is set, otherwise sets the operand to 0. | |
| SETE/SETZ | dest | | | Sets the byte in the operand to 1 if the Zero Flag is set, otherwise sets the operand to 0. | |
| SETNE/SETNZ | dest | | | Sets the byte in the operand to 1 if the Zero Flag is clear, otherwise sets the operand to 0. | |
| SETL/SETNGE | dest | | | Sets the byte in the operand to 1 if the Sign Flag is not equal to the Overflow Flag, otherwise sets the operand to 0. | |
| SETGE/SETNL | dest | | | Sets the byte in the operand to 1 if the Sign Flag equals the Overflow Flag, otherwise sets the operand to 0. | |
| SETLE/SETNG | dest | | | Sets the byte in the operand to 1 if the Zero Flag is set or the Sign Flag is not equal to the Overflow Flag, otherwise sets the operand to 0. | |
| SETG/SETNLE | dest | | | Sets the byte in the operand to 1 if the Zero Flag is clear and the Sign Flag equals the Overflow Flag, otherwise sets the operand to 0. | |
| SETS | dest | | | Sets the byte in the operand to 1 if the Sign Flag is set, otherwise sets the operand to 0. | |
| SETNS | dest | | | Sets the byte in the operand to 1 if the Sign Flag is clear, otherwise sets the operand to 0. | |
| SETC | dest | | | Sets the byte in the operand to 1 if the Carry Flag is set, otherwise sets the operand to 0. | |
| SETNC | dest | | | Sets the byte in the operand to 1 if the Carry Flag is clear, otherwise sets the operand to 0. | |
| SETO | dest | | | Sets the byte in the operand to 1 if the Overflow Flag is set, otherwise sets the operand to 0. | |
| SETNO | dest | | | Sets the byte in the operand to 1 if the Overflow Flag is clear, otherwise sets the operand to 0. | |
| SETP/SETPE | dest | | | Sets the byte in the operand to 1 if the Parity Flag is set, otherwise sets the operand to 0. | |
| SETNP/SETPO | dest | | | Sets the byte in the operand to 1 if the Parity Flag is clear, otherwise sets the operand to 0. | |
| STI | | | | Sets the Interrupt Flag to 1, which enables recognition of all hardware interrupts. If an interrupt is generated by a hardware device, an End of Interrupt (EOI) must also be issued to enable other hardware interrupts of the same or lower priority. | IF |
| TEST | src | dest | | Performs a logical AND of the two operands, updating the flags register without saving the result. | AF, CF, OF, PF, SF, ZF |
| UD2 | | | | Undefined Instruction; Generates an invalid opcode. This instruction is provided for software testing to explicitly generate an invalid opcode. The opcode for this instruction is reserved for this purpose. | |
| VERR | src | | | Verifies the specified segment selector is valid and is readable at the current privilege level. If the segment is readable, the Zero Flag is set, otherwise it is cleared. | ZF |
| VERW | src | | | Verifies the specified segment selector is valid and is writable at the current privilege level. If the segment is writable, the Zero Flag is set, otherwise it is cleared. | ZF |
| WAIT/FWAIT | | | | The CPU enters a wait state until the coprocessor signals it has finished its operation. This instruction is used to prevent the CPU from accessing memory that may be temporarily in use by the coprocessor. WAIT and FWAIT are identical. | |
| WBINVD | | | | Flushes the internal cache, then signals the external cache to write back current data, followed by a signal to flush the external cache. | |

Data Manipulation
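A few of the instructions in this section are easier to picture as plain C. The sketch below models ROL, BSWAP, and BSF on 32-bit values; the function names are hypothetical, the Carry Flag is passed through an `int` pointer instead of a real flags register, and `bsf32` returns -1 when no bit is set rather than modelling the flag result.

```c
#include <stdint.h>

/* ROL model: rotate left; *cf receives the last bit rotated out.
 * 32-bit rotates use only the low 5 bits of the count, and a masked
 * count of 0 leaves both the value and the flag untouched. */
static uint32_t rol32(uint32_t value, unsigned count, int *cf)
{
    count &= 31u;
    if (count == 0)
        return value;
    uint32_t r = (value << count) | (value >> (32u - count));
    *cf = (int)(r & 1u);   /* the last bit rotated out re-entered at bit 0 */
    return r;
}

/* BSWAP model: reverse the byte order of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* BSF model: index of the lowest set bit, or -1 if v has no set bit. */
static int bsf32(uint32_t v)
{
    if (v == 0)
        return -1;
    int index = 0;
    while (!(v & 1u)) {
        v >>= 1;
        ++index;
    }
    return index;
}
```

For example, rotating 0x80000001 left by one gives 0x00000003 with the carry set, `bswap32(0x11223344)` gives 0x44332211, and `bsf32(0x18)` gives 3.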
| Mnemonic | Operand 1 | Operand 2 | Operand 3 | Description | Flags Modified |
|---|---|---|---|---|---|
| AAA | | | | ASCII adjust AL after addition; used with unpacked binary-coded decimal | AF, CF, OF, PF, SF, ZF |
| AAD | | | | ASCII adjust AX before division | AF, CF, OF, PF, SF, ZF |
| AAM | | | | ASCII adjust AX after multiplication | AF, CF, OF, PF, SF, ZF |
| AAS | | | | ASCII adjust AL after subtraction | AF, CF, OF, PF, SF, ZF |
| ADC | src | dest | | Add with carry | AF, CF, OF, PF, SF, ZF |
| ADD | src | dest | | Add | AF, CF, OF, PF, SF, ZF |
| AND | src | dest | | Logical AND | AF, CF, OF, PF, SF, ZF |
| ARPL | src | dest | | Adjust Requested Privilege Level of Selector; Compares the RPL bits of "dest" against "src". If the RPL bits of "dest" are less than "src", the destination RPL bits are set equal to the source RPL bits and the Zero Flag is set. Otherwise the Zero Flag is cleared. | ZF |
| BSF | src | dest | | Scans the source operand for the first (least significant) bit set. If a set bit is found, clears ZF and loads the destination with the index of that bit. Sets ZF if the source is zero (no bits set). | ZF |
| BSR | src | dest | | Scans the source operand for the last (most significant) bit set. If a set bit is found, clears ZF and loads the destination with the index of that bit. Sets ZF if the source is zero (no bits set). | ZF |
| BSWAP | 32-bit reg | | | Changes the byte order of a 32-bit register from big endian to little endian or vice versa. The result left in the destination register is undefined if the operand is a 16-bit register. | |
| CBW | | | | Convert byte in AL to a word in AX | |
| CDQ | | | | Converts a signed dword in EAX to a signed quadword in EDX:EAX | |
| CDQE | | | | Sign extend EAX into RAX | |
| CLC | | | | Clear the carry bit | CF |
| CLD | | | | Clear the direction flag | DF |
| CLI | | | | Disables the maskable hardware interrupts by clearing the Interrupt Flag. NMIs and software interrupts are not inhibited. | IF |
| CLTS | | | | Clears the Task Switched Flag in the Machine Status Word (CR0). This is a privileged operation and is generally used only by operating system code. | TS (in CR0) |
| CMC | | | | Toggle the carry flag | CF |
| CQO | | | | Sign extend RAX into RDX:RAX | |
| CWD | | | | Extends the sign of the word in register AX throughout register DX, forming a doubleword quantity in DX:AX. | |
| CWDE | | | | Converts a signed word in AX to a signed doubleword in EAX by extending the sign bit of AX throughout EAX. | |
| DAA | | | | Corrects the result (in AL) of a previous BCD addition operation. The contents of AL are changed to a pair of packed decimal digits. | AF, CF, OF, PF, SF, ZF |
| DAS | | | | Corrects the result (in AL) of a previous BCD subtraction operation. The contents of AL are changed to a pair of packed decimal digits. | AF, CF, OF, PF, SF, ZF |
| DEC | dest | | | Decrement | AF, OF, PF, SF, ZF |
| DIV | src | | | Unsigned binary division | AF, CF, OF, PF, SF, ZF |
| IDIV | src | | | Signed binary division | AF, CF, OF, PF, SF, ZF |
| IMUL | immediate or src | src (not immediate) | dest | Signed multiplication | AF, CF, OF, PF, SF, ZF |
| IN | port | accumulator | | A byte, word, or dword is read from "port" and placed in AL, AX, or EAX respectively. If the port number is in the range 0-255 it can be specified as an immediate, otherwise the port number must be specified in DX. Valid port ranges on the PC are 0-1023, though values through 65535 may be specified and recognized by third-party vendors and PS/2s. | |
| INC | dest | | | Increment | AF, OF, PF, SF, ZF |
| INS | port | dest | | Loads data from the port to the destination ES:(E)DI (even if a destination operand is supplied). (E)DI is adjusted by the size of the operand: increased if the Direction Flag is cleared and decreased if the Direction Flag is set. For INSB, INSW, and INSD no operands are allowed and the size is determined by the mnemonic. | |
| INVD | | | | Flushes the CPU internal cache. Issues a special function bus cycle which indicates to flush external caches. Data in write-back external caches is lost. | |
| INVLPG | | | | Invalidates a single page table entry in the Translation Look-Aside Buffer. | |
| MUL | src | | | Unsigned multiply of the accumulator by the source. If "src" is a byte value, then AL is used as the other multiplicand and the result is placed in AX. If "src" is a word value, then AX is multiplied by "src" and DX:AX receives the result. If "src" is a doubleword value, then EAX is multiplied by "src" and EDX:EAX receives the result. | CF, OF, AF, PF, SF, ZF |
| NEG | dest | | | Subtracts the destination from 0 and saves the 2s complement of "dest" back into "dest". | CF, OF, AF, PF, SF, ZF |
| NOT | dest | | | Inverts the bits of the "dest" operand, forming the 1s complement. | |
| OR | src | dest | | Logical inclusive OR of the two operands, returning the result in the destination. | CF, OF, AF, PF, SF, ZF |
| RCL | count | dest | | Rotates the bits in the destination to the left "count" times through the Carry Flag: each bit pushed out the left side enters the Carry Flag, and the previous Carry Flag value re-enters on the right. | CF, OF |
| RCR | count | dest | | Rotates the bits in the destination to the right "count" times through the Carry Flag: each bit pushed out the right side enters the Carry Flag, and the previous Carry Flag value re-enters on the left. | CF, OF |
| ROL | count | dest | | Rotates the bits in the destination to the left "count" times with all data pushed out the left side re-entering on the right. The Carry Flag will contain the value of the last bit rotated out. | CF, OF |
| ROR | count | dest | | Rotates the bits in the destination to the right "count" times with all data pushed out the right side re-entering on the left. The Carry Flag will contain the value of the last bit rotated out. | CF, OF |
| SAL/SHL | count | dest | | Shifts the destination left by "count" bits with zeros shifted in on the right. The Carry Flag contains the last bit shifted out. | AF, CF, OF, PF, SF, ZF |
| SAR | count | dest | | Shifts the destination right by "count" bits with the current sign bit replicated in the leftmost bit. The Carry Flag contains the last bit shifted out. | AF, CF, OF, PF, SF, ZF |
| SBB | src | dest | | Subtracts the source from the destination, and subtracts 1 extra if the Carry Flag is set. Results are returned in "dest". | AF, CF, OF, PF, SF, ZF |
| SCASQ | | | | SCAn String Quadword | |
| SHR | count | dest | | Shifts the destination right by "count" bits with zeros shifted in on the left. The Carry Flag contains the last bit shifted out. | AF, CF, OF, PF, SF, ZF |
| SHLD/SHRD | count | src | dest | SHLD shifts "dest" to the left "count" times and the bit positions opened are filled with the most significant bits of "src". SHRD shifts "dest" to the right "count" times and the bit positions opened are filled with the least significant bits of "src". Only the 5 lower bits of "count" are used. | AF, CF, OF, PF, SF, ZF |
| STC | | | | Sets the Carry Flag to 1. | CF |
| STD | | | | Sets the Direction Flag to 1, causing string instructions to auto-decrement SI and DI instead of auto-increment. | DF |
| SUB | src | dest | | The source is subtracted from the destination and the result is stored in the destination. | AF, CF, OF, PF, SF, ZF |
| XADD | src | dest | | Exchanges the first operand with the second operand, then loads the sum of the two values into the destination operand. | AF, CF, OF, PF, SF, ZF |
| XOR | src | dest | | Performs a bitwise exclusive OR of the operands and returns the result in the destination. | AF, CF, OF, PF, SF, ZF |

Data Transfer
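Three of the transfer instructions in this section, MOVZX, MOVSX, and XLAT, have direct C analogues. The sketch below is only a model with hypothetical function names: it widens an 8-bit source to 32 bits with zero or sign extension, and performs the table lookup that XLAT does with AL as the index.

```c
#include <stdint.h>

/* MOVZX model: widen an 8-bit source, filling the upper bits with zeros. */
static uint32_t movzx8(uint8_t src)
{
    return (uint32_t)src;
}

/* MOVSX model: widen an 8-bit source, copying its sign bit (bit 7)
 * into every upper bit of the 32-bit result. */
static uint32_t movsx8(uint8_t src)
{
    return (src & 0x80u) ? (0xFFFFFF00u | src) : (uint32_t)src;
}

/* XLAT model: replace AL with the table byte that AL indexes. */
static uint8_t xlat(const uint8_t *table, uint8_t al)
{
    return table[al];
}
```

With a source byte of 0x80, `movzx8` yields 0x00000080 while `movsx8` yields 0xFFFFFF80, which is the whole difference between the two instructions.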
| Mnemonic | Operand 1 | Operand 2 | Description | Flags Modified |
|---|---|---|---|---|
| CLFLUSH | | | Cache Line Flush; Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy | |
| LAHF | | | Copies flags AF, CF, PF, SF, and ZF into AH | |
| LAR | src | dest | The high byte of the destination register is overwritten by the value of the access rights byte and the low-order byte is zeroed, depending on the selection in the source operand. The Zero Flag is set if the load operation is successful. | ZF |
| LDS | src | dest | Loads a 32-bit pointer from the memory source to the destination register and DS. The offset is placed in the destination register and the segment is placed in DS. To use this instruction the word at the lower memory address must contain the offset and the word at the higher address must contain the segment. This simplifies the loading of far pointers from the stack and the interrupt vector table. | |
| LEA | src | dest | Transfers the offset address of "src" to the destination register. | |
| LES | src | dest | Loads a 32-bit pointer from the memory source to the destination register and ES. The offset is placed in the destination register and the segment is placed in ES. To use this instruction the word at the lower memory address must contain the offset and the word at the higher address must contain the segment. This simplifies the loading of far pointers from the stack and the interrupt vector table. | |
| LFENCE | | | Load fence; Serializes load operations | |
| LFS | src | dest | Loads a 32-bit pointer from the memory source to the destination register and FS. The offset is placed in the destination register and the segment is placed in FS. To use this instruction the word at the lower memory address must contain the offset and the word at the higher address must contain the segment. This simplifies the loading of far pointers from the stack and the interrupt vector table. | |
| LGDT | src | | Loads a value from an operand into the Global Descriptor Table (GDT) register. | |
| LIDT | src | | Loads a value from an operand into the Interrupt Descriptor Table (IDT) register. | |
| LGS | src | dest | Loads a 32-bit pointer from the memory source to the destination register and GS. The offset is placed in the destination register and the segment is placed in GS. To use this instruction the word at the lower memory address must contain the offset and the word at the higher address must contain the segment. This simplifies the loading of far pointers from the stack and the interrupt vector table. | |
| LLDT | src | | Loads a value from an operand into the Local Descriptor Table Register (LDTR). | |
| LMSW | src | | Loads the Machine Status Word (MSW) from data found at "src". | |
| LODS | | | Transfers the string element addressed by DS:SI (even if an operand is supplied) to the accumulator. SI is adjusted based on the size of the operand or on the instruction used. If the Direction Flag is set SI is decremented; if the Direction Flag is clear SI is incremented. Use with REP prefixes. | |
| LODSQ | | | LOaD String Quadword | |
| LSL | src | dest | Loads the segment limit of a selector into the destination register if the selector is valid and visible at the current privilege level. If loading is successful the Zero Flag is set, otherwise it is cleared. | ZF |
| LSS | src | dest | Loads a 32-bit pointer from the memory source to the destination register and SS. The offset is placed in the destination register and the segment is placed in SS. To use this instruction the word at the lower memory address must contain the offset and the word at the higher address must contain the segment. This simplifies the loading of far pointers from the stack and the interrupt vector table. | |
| LTR | src | | Loads the current task register with the value specified in "src". | |
| MASKMOVDQU | | | Masked Move of Double Quadword Unaligned; Stores selected bytes from the source operand (first operand) into a 128-bit memory location | |
| MASKMOVQ | | | Masked Move of Quadword; Selectively writes bytes from MM1 to a memory location using the byte mask in MM2 | |
| MFENCE | | | Memory Fence; Performs a serializing operation on all load and store instructions that were issued prior to the MFENCE instruction. | |
| MOV | src | dest | Copies a byte, word, doubleword, or quadword from the source operand to the destination operand. | |
| MOVNTDQ | | | Move Double Quadword Non-Temporal; Moves a double quadword from XMM to M128, minimizing pollution in the cache hierarchy. | |
| MOVNTI | | | Move Doubleword Non-Temporal; Moves a doubleword from r32 to m32, minimizing pollution in the cache hierarchy. | |
| MOVNTPD | | | Move Packed Double-Precision Floating-Point Values Non-Temporal; Moves packed double-precision floating-point values from XMM to M128, minimizing pollution in the cache hierarchy. | |
| MOVNTPS | | | Move Aligned Four Packed Single-FP Non-Temporal; Moves packed single-precision floating-point values from XMM to M128, minimizing pollution in the cache hierarchy. | |
| MOVNTQ | | | Move Quadword Non-Temporal | |
| MOVS | src | dest | Copies data from the location addressed by DS:SI (even if operands are given) to the location ES:DI and updates SI and DI based on the size of the operand or instruction used. SI and DI are incremented when the Direction Flag is cleared and decremented when the Direction Flag is set. | |
| MOVSX | src | dest | Copies the value of the source operand to the destination register with the sign extended. | |
| MOVSXD | src | dest | MOV with Sign Extend 32-bit to 64-bit | |
| MOVZX | src | dest | Copies the value of the source operand to the destination register, extended with zeros. | |
| OUT | accum | port | Transfers the byte in AL, word in AX, or dword in EAX to the specified hardware port address. If the port number is in the range 0-255 it can be specified as an immediate. If greater than 255 then the port number must be specified in DX. Since the PC only decodes 10 bits of the port address, values over 1023 can only be decoded by third-party vendor equipment and also map to the port range 0-1023. | |
| OUTS | src | port | Transfers a byte, word, or doubleword from "src" to the hardware port specified in DX. For instructions with no operands the "src" is located at DS:SI, and SI is incremented or decremented by the size of the operand or the size dictated by the instruction format. When the Direction Flag is set SI is decremented; when clear, SI is incremented. Since the PC only decodes 10 bits of the port address, values over 1023 can only be decoded by third-party vendor equipment and also map to the port range 0-1023. | |
| POP | dest | | Transfers the word at the current stack top (SS:SP) to the destination, then increments SP by two to point to the new stack top. CS is not a valid destination. | |
| POPA/POPAD | | | Pops the top 8 words off the stack into the 8 general-purpose 16/32-bit registers. Registers are popped in the following order: (E)DI, (E)SI, (E)BP, (E)SP, (E)BX, (E)DX, (E)CX, and (E)AX. The (E)SP value popped from the stack is actually discarded. | |
| POPF/POPFD | | | Pops a word/doubleword from the stack into the Flags Register and then increments SP by 2 (for POPF) or 4 (for POPFD). | |
| POPFQ | | | POP RFLAGS Register | |
| PREFETCHT0 | | | Prefetch into all cache levels | |
| PREFETCHT1 | | | Prefetch into all cache levels EXCEPT L1 | |
| PREFETCHT2 | | | Prefetch into all cache levels EXCEPT L1 and L2 | |
| PREFETCHNTA | | | Prefetch into a non-temporal cache structure, minimizing cache pollution | |
| PUSH | src or immediate | | Decrements SP by the size of the operand (two or four; byte values are sign extended) and transfers one word from the source to the stack top (SS:SP). | |
| PUSHA/PUSHAD | | | Pushes all general-purpose registers onto the stack in the following order: (E)AX, (E)CX, (E)DX, (E)BX, (E)SP, (E)BP, (E)SI, (E)DI. The value of SP pushed is the value before the actual push of SP. | |
| PUSHF/PUSHFD | | | Transfers the Flags Register onto the stack. PUSHF saves a 16-bit value while PUSHFD saves a 32-bit value. | |
| PUSHFQ | | | PUSH RFLAGS Register | |
| RDMSR | | | Loads the MSR specified by ECX into EDX:EAX. | |
| RDPMC | | | Reads the PMC (Performance Monitoring Counter) specified in the ECX register into registers EDX:EAX | |
| RDTSC | | | Reads the time-stamp counter, the number of processor ticks since the last reset (power-on) of the system, into EDX:EAX. | |
| RDTSCP | | | ReaD Time Stamp Counter and Processor ID | |
| SAHF | | | Transfers bits 0-7 of AH into the Flags Register, setting AF, CF, PF, SF, and ZF. | AF, CF, PF, SF, ZF |
| SFENCE | | | Store fence; ensures that all store operations that took place prior to the SFENCE instruction are globally visible before any later store | |
| SGDT | dest | | Stores the Global Descriptor Table (GDT) Register into the specified operand. | |
| SIDT | dest | | Stores the Interrupt Descriptor Table (IDT) Register into the specified operand. | |
| SLDT | dest | | Stores the Local Descriptor Table (LDT) Register into the specified operand. | |
| SMSW | dest | | Stores the Machine Status Word (MSW) into "dest". | |
| STOS | dest | | Stores the value in the accumulator to the location at ES:(E)DI (even if an operand is given). (E)DI is incremented/decremented based on the size of the operand (or instruction format) and the state of the Direction Flag. | |
| STOSQ | | | STOre String Quadword | |
| STR | dest | | Stores the current Task Register to the specified operand. | |
| SWAPGS | | | Exchanges the GS base with the KernelGSBase MSR | |
| WRMSR | | | Writes the value in EDX:EAX to the MSR specified by ECX. | |
| XCHG | src | dest | Exchanges the contents of source and destination. | |
| XLAT/XLATB | | | Replaces the byte in AL with a byte from a user table addressed by BX. The original value of AL is the index into the translate table. The best way to describe this is MOV AL,[BX+AL] | |

x87 Floating-Point Instructions
- F2XM1 - 2^x - 1
- FABS - Absolute value
- FADD - Add
- FADDP - Add and pop
- FBLD - Load BCD
- FBSTP - Store BCD and pop
- FCHS - Change sign
- FCLEX - Clear exceptions
- FCOM - Compare
- FCOMP - Compare and pop
- FCOMPP - Compare and pop twice
- FCOS - Cosine
- FDECSTP - Decrement floating-point stack pointer
- FDISI - Disable interrupts; 8087 only, otherwise FNOP
- FDIV - Divide
- FDIVP - Divide and pop
- FDIVR - Divide reversed
- FDIVRP - Divide reversed and pop
- FENI - Enable interrupts; 8087 only, otherwise FNOP
- FFREE - Free register
- FIADD - Integer add
- FICOM - Integer compare
- FICOMP - Integer compare and pop
- FIDIV - Integer divide
- FIDIVR - Integer divide reversed
- FILD - Load integer
- FIMUL - Integer multiply
- FINCSTP - Increment floating-point stack pointer
- FINIT - Initialize floating-point processor
- FIST - Store integer
- FISTP - Store integer and pop
- FISTTP - Store integer with truncation and pop, regardless of the rounding mode in the control word; SSE3
- FISUB - Integer subtract
- FISUBR - Integer subtract reversed
- FLD - Floating-point load
- FLD1 - Load 1.0 onto stack
- FLDCW - Load control word
- FLDENV - Load environment state
- FLDENVD - Load environment state, 32-bit
- FLDENVW - Load environment state, 16-bit
- FLDL2E - Load log2(e) onto stack
- FLDL2T - Load log2(10) onto stack
- FLDLG2 - Load log10(2) onto stack
- FLDLN2 - Load ln(2) onto stack
- FLDPI - Load π onto stack
- FLDZ - Load 0.0 onto stack
- FMUL - Multiply
- FMULP - Multiply and pop
- FNCLEX - Clear exceptions, no wait
- FNDISI - Disable interrupts, no wait; 8087 only, otherwise FNOP
- FNENI - Enable interrupts, no wait; 8087 only, otherwise FNOP
- FNINIT - Initialize floating-point processor, no wait
- FNOP - No operation
- FNSAVE - Save FPU state, no wait
- FNSAVEW - Save FPU state, no wait, 16-bit
- FNSTCW - Store control word, no wait
- FNSTENV - Store FPU environment, no wait
- FNSTENVW - Store FPU environment, no wait, 16-bit
- FNSTSW - Store status word, no wait
- FPATAN - Partial arctangent
- FPREM - Partial remainder (quotient truncated toward zero)
- FPREM1 - Partial remainder (IEEE 754; quotient rounded to nearest)
- FPTAN - Partial tangent
- FRNDINT - Round to integer
- FRSTOR - Restore saved state
- FRSTORD - Restore saved state, 32-bit
- FRSTORW - Restore saved state, 16-bit
- FSAVE - Save FPU state
- FSAVED - Save FPU state, 32-bit
- FSAVEW - Save FPU state, 16-bit
- FSCALE - Scale by power of 2
- FSETPM - Set protected mode; 80287 only, otherwise FNOP
- FSIN - Sine
- FSINCOS - Sine and cosine
- FSQRT - Square root
- FST - Floating-point store
- FSTCW - Store control word
- FSTENV - Store FPU environment
- FSTENVD - Store FPU environment, 32-bit
- FSTENVW - Store FPU environment, 16-bit
- FSTP - Store and pop
- FSTSW - Store status word
- FSUB - Subtract
- FSUBP - Subtract and pop
- FSUBR - Reverse subtract
- FSUBRP - Reverse subtract and pop
- FTST - Test for zero
- FUCOM - Unordered compare
- FUCOMP - Unordered compare and pop
- FUCOMPP - Unordered compare and pop twice
- FWAIT - Wait while FPU is executing
- FXAM - Examine ST(0) and set the condition flags
- FXCH - Exchange registers
- FXTRACT - Extract exponent and significand
- FYL2X - y*log2(x)
- FYL2XP1 - y*log2(x+1)
- FCMOV Variants: FCMOVB, FCMOVBE, FCMOVE, FCMOVNB, FCMOVNBE, FCMOVNE, FCMOVNU, FCMOVU
- FCOMI Variants: FCOMI, FCOMIP, FUCOMI, FUCOMIP
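FPREM and FPREM1 are both listed as "partial remainder", so a small model of how their results differ may be useful. The sketch below is plain Python rather than x87 code, and it models only the final, fully reduced result (the real instructions may need to be repeated until the reduction is complete). It assumes that FPREM's truncating remainder corresponds to `math.fmod` and that FPREM1's IEEE 754 remainder corresponds to `math.remainder`. The function names `fprem_model` and `fprem1_model` are hypothetical and exist only for this illustration.

```python
import math


def fprem_model(st0: float, st1: float) -> float:
    """Model of FPREM's final result: the quotient st0/st1 is truncated
    toward zero, so the remainder has the same sign as st0."""
    return math.fmod(st0, st1)


def fprem1_model(st0: float, st1: float) -> float:
    """Model of FPREM1's final result: the quotient st0/st1 is rounded to
    the nearest integer (IEEE 754 remainder), so the result may be negative
    even when both inputs are positive."""
    return math.remainder(st0, st1)


if __name__ == "__main__":
    # Print both remainders side by side for a few sample operands.
    for x, y in [(5.0, 3.0), (-5.0, 3.0), (7.5, 2.0)]:
        print(f"x={x} y={y} fprem={fprem_model(x, y)} fprem1={fprem1_model(x, y)}")
```

For 5.0 and 3.0, the truncating form yields 2.0 while the IEEE 754 form yields -1.0, because 5/3 rounds to 2 and 5 - 2*3 = -1.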
MMX Instructions
- EMMS - Empty MMX Technology State; marks all x87 FPU registers as empty so they are available for use by the FPU
- MOVD - Move doubleword
- MOVQ - Move quadword
- PACKSSDW - Pack doubleword to word (signed with saturation)
- PACKSSWB - Pack word to byte (signed with saturation)
- PACKUSWB - Pack word to byte (unsigned with saturation)
- PADDB - Add packed byte integers
- PADDW - Add packed word integers
- PADDD - Add packed doubleword integers
- PADDSB - Add packed signed byte integers and saturate
- PADDSW - Add packed signed word integers and saturate
- PADDUSB - Add packed unsigned byte integers and saturate
- PADDUSW - Add packed unsigned word integers and saturate
- PAND - Bitwise AND
- PANDN - Bitwise AND NOT
- POR - Bitwise OR
- PXOR - Bitwise XOR
- PCMPEQB - Compare packed byte integers for equality
- PCMPEQW - Compare packed word integers for equality
- PCMPEQD - Compare packed doubleword integers for equality
- PCMPGTB - Compare packed signed byte integers for greater than
- PCMPGTW - Compare packed signed word integers for greater than
- PCMPGTD - Compare packed signed doubleword integers for greater than
- PMADDWD - Multiply packed word integers, add adjacent doubleword results
- PMULHW - Multiply packed signed word integers, store high 16 bits of results
- PMULLW - Multiply packed signed word integers, store low 16 bits of results
- PSLLW - Shift left word, shift in zeros
- PSLLD - Shift left doubleword, shift in zeros
- PSLLQ - Shift left quadword, shift in zeros
- PSRAD - Shift right doubleword, shift in sign bits
- PSRAW - Shift right word, shift in sign bits
- PSRLW - Shift right word, shift in zeros
- PSRLD - Shift right doubleword, shift in zeros
- PSRLQ - Shift right quadword, shift in zeros
- PSUBB - Subtract packed byte integers
- PSUBW - Subtract packed word integers
- PSUBD - Subtract packed doubleword integers
- PSUBSB - Subtract packed signed byte integers with saturation
- PSUBSW - Subtract packed signed word integers with saturation
- PSUBUSB - Subtract packed unsigned byte integers with saturation
- PSUBUSW - Subtract packed unsigned word integers with saturation
- PUNPCKHBW - Unpack and interleave high-order bytes
- PUNPCKHWD - Unpack and interleave high-order words
- PUNPCKHDQ - Unpack and interleave high-order doublewords
- PUNPCKLBW - Unpack and interleave low-order bytes
- PUNPCKLWD - Unpack and interleave low-order words
- PUNPCKLDQ - Unpack and interleave low-order doublewords
SSE Instructions
- SSE SIMD Floating-Point Instructions: ADDPS, ADDSS, ANDNPS, ANDPS, CMPPS, CMPSS, COMISS, CVTPI2PS, CVTPS2PI, CVTSI2SS, CVTSS2SI, CVTTPS2PI, CVTTSS2SI, DIVPS, DIVSS, LDMXCSR, MAXPS, MAXSS, MINPS, MINSS, MOVAPS, MOVHLPS, MOVHPS, MOVLHPS, MOVLPS, MOVMSKPS, MOVNTPS, MOVSS, MOVUPS, MULPS, MULSS, ORPS, RCPPS, RCPSS, RSQRTPS, RSQRTSS, SHUFPS, SQRTPS, SQRTSS, STMXCSR, SUBPS, SUBSS, UCOMISS, UNPCKHPS, UNPCKLPS, XORPS
- SSE SIMD Integer Instructions: PAVGB, PAVGW, PEXTRW, PINSRW, PMAXSW, PMAXUB, PMINSW, PMINUB, PMOVMSKB, PMULHUW, PSADBW, PSHUFW
- SSE2 SIMD Floating-Point Instructions: ADDPD, ADDSD, ANDNPD, ANDPD, CMPPD, CMPSD*, COMISD, CVTDQ2PD, CVTDQ2PS, CVTPD2DQ, CVTPD2PI, CVTPD2PS, CVTPI2PD, CVTPS2DQ, CVTPS2PD, CVTSD2SI, CVTSD2SS, CVTSI2SD, CVTSS2SD, CVTTPD2DQ, CVTTPD2PI, CVTTPS2DQ, CVTTSD2SI, DIVPD, DIVSD, MAXPD, MAXSD, MINPD, MINSD, MOVAPD, MOVHPD, MOVLPD, MOVMSKPD, MOVSD*, MOVUPD, MULPD, MULSD, ORPD, SHUFPD, SQRTPD, SQRTSD, SUBPD, SUBSD, UCOMISD, UNPCKHPD, UNPCKLPD, XORPD
- SSE3 SIMD Floating-Point Instructions: ADDSUBPD, ADDSUBPS, HADDPD, HADDPS, HSUBPD, HSUBPS, MOVDDUP, MOVSHDUP, MOVSLDUP
- SSE3 SIMD Integer Instructions: LDDQU
- SSSE3 Instructions: PSIGNW, PSIGND, PSIGNB, PSHUFB, PMULHRSW, PMADDUBSW, PHSUBW, PHSUBSW, PHSUBD, PHADDW, PHADDSW, PHADDD, PALIGNR, PABSW, PABSD, PABSB
- SSE4.1 SIMD Floating-Point Instructions: DPPS, DPPD, BLENDPS, BLENDPD, BLENDVPS, BLENDVPD, ROUNDPS, ROUNDSS, ROUNDPD, ROUNDSD, INSERTPS, EXTRACTPS
- SSE4.1 SIMD Integer Instructions: MPSADBW, PHMINPOSUW, PMULLD, PMULDQ, PBLENDVB, PBLENDW, PMINSB, PMAXSB, PMINUW, PMAXUW, PMINUD, PMAXUD, PMINSD, PMAXSD, PINSRB, PINSRD/PINSRQ, PEXTRB, PEXTRW, PEXTRD, PEXTRQ, PMOVSXBW, PMOVZXBW, PMOVSXBD, PMOVZXBD, PMOVSXBQ, PMOVZXBQ, PMOVSXWD, PMOVZXWD, PMOVSXWQ, PMOVZXWQ, PMOVSXDQ, PMOVZXDQ, PTEST, PCMPEQQ, PACKUSDW, MOVNTDQA
- SSE4a Instructions: EXTRQ, INSERTQ, MOVNTSD, MOVNTSS
- SSE4.2 Instructions: CRC32, PCMPESTRI, PCMPESTRM, PCMPISTRI, PCMPISTRM, PCMPGTQ
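The lists above give only mnemonics, so here is a rough model of what two of the SSE integer instructions, PAVGB and PSADBW, compute. This is a plain Python sketch, not SIMD code. It treats an operand as a `bytes` object of unsigned bytes, and PSADBW is modeled on a single 8-byte group. The function names `pavgb` and `psadbw` are hypothetical helpers chosen to mirror the mnemonics.

```python
def pavgb(a: bytes, b: bytes) -> bytes:
    """Model of PAVGB: average each pair of unsigned bytes, rounding up,
    i.e. (x + y + 1) >> 1 computed without overflow."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes((x + y + 1) >> 1 for x, y in zip(a, b))


def psadbw(a: bytes, b: bytes) -> int:
    """Model of PSADBW on one 8-byte group: the sum of the absolute
    differences of the eight unsigned byte pairs."""
    if len(a) != 8 or len(b) != 8:
        raise ValueError("this model works on exactly 8 bytes per operand")
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    print(list(pavgb(bytes([0, 1, 255, 10]), bytes([0, 2, 255, 13]))))
    print(psadbw(bytes([1, 2, 3, 4, 5, 6, 7, 8]), bytes([8, 7, 6, 5, 4, 3, 2, 1])))
```

The rounding term in `pavgb` is why averaging 1 and 2 gives 2 rather than 1.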
FMA Instructions
- VFMADDPD - Fused Multiply-Add of Packed Double-Precision Floating-Point Values; for example, VFMADDPD xmm0, xmm1, xmm2, xmm3 (the four-operand FMA4 form)
- VFMADDPS - Fused Multiply-Add of Packed Single-Precision Floating-Point Values
- VFMADDSD - Fused Multiply-Add of Scalar Double-Precision Floating-Point Values
- VFMADDSS - Fused Multiply-Add of Scalar Single-Precision Floating-Point Values
- VFMADDSUBPD - Fused Multiply-Alternating Add/Subtract of Packed Double-Precision Floating-Point Values
- VFMADDSUBPS - Fused Multiply-Alternating Add/Subtract of Packed Single-Precision Floating-Point Values
- VFMSUBADDPD - Fused Multiply-Alternating Subtract/Add of Packed Double-Precision Floating-Point Values
- VFMSUBADDPS - Fused Multiply-Alternating Subtract/Add of Packed Single-Precision Floating-Point Values
- VFMSUBPD - Fused Multiply-Subtract of Packed Double-Precision Floating-Point Values
- VFMSUBPS - Fused Multiply-Subtract of Packed Single-Precision Floating-Point Values
- VFMSUBSD - Fused Multiply-Subtract of Scalar Double-Precision Floating-Point Values
- VFMSUBSS - Fused Multiply-Subtract of Scalar Single-Precision Floating-Point Values
- VFNMADDPD - Fused Negative Multiply-Add of Packed Double-Precision Floating-Point Values
- VFNMADDPS - Fused Negative Multiply-Add of Packed Single-Precision Floating-Point Values
- VFNMADDSD - Fused Negative Multiply-Add of Scalar Double-Precision Floating-Point Values
- VFNMADDSS - Fused Negative Multiply-Add of Scalar Single-Precision Floating-Point Values
- VFNMSUBPD - Fused Negative Multiply-Subtract of Packed Double-Precision Floating-Point Values
- VFNMSUBPS - Fused Negative Multiply-Subtract of Packed Single-Precision Floating-Point Values
- VFNMSUBSD - Fused Negative Multiply-Subtract of Scalar Double-Precision Floating-Point Values
- VFNMSUBSS - Fused Negative Multiply-Subtract of Scalar Single-Precision Floating-Point Values
AES Instructions
- AESENC - Perform one round of an AES encryption flow
- AESENCLAST - Perform the last round of an AES encryption flow
- AESDEC - Perform one round of an AES decryption flow
- AESDECLAST - Perform the last round of an AES decryption flow
- AESKEYGENASSIST - Assist in AES round key generation
- AESIMC - Assist in AES Inverse Mix Columns
Miscellaneous Instructions
Intel VT-x: VMPTRLD, VMPTRST, VMCLEAR, VMREAD, VMWRITE, VMCALL, VMLAUNCH, VMRESUME, VMXOFF, VMXON
ABM: LZCNT, POPCNT
BMI1: ANDN, BEXTR, BLSI, BLSMSK, BLSR, TZCNT
BMI2: BZHI, MULX, PDEP, PEXT, RORX, SARX, SHRX, SHLX
TBM: BEXTR, BLCFILL, BLCI, BLCIC, BLCMASK, BLCS, BLSFILL, BLSIC, T1MSKC, TZMSK
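Several of the ABM and BMI1 instructions listed above reduce to short bit-manipulation identities. The following Python sketch models them on 64-bit unsigned values. It is an illustration of the arithmetic only: the real instructions also update flags, which this model ignores. The helper names simply mirror the mnemonics and are not part of any library.

```python
MASK64 = (1 << 64) - 1  # keep results within a 64-bit register width


def andn(a: int, b: int) -> int:
    """ANDN: bitwise AND of the inverted first operand with the second."""
    return ~a & b & MASK64


def blsi(x: int) -> int:
    """BLSI: isolate the lowest set bit (x AND -x)."""
    return x & -x & MASK64


def blsmsk(x: int) -> int:
    """BLSMSK: mask of all bits up to and including the lowest set bit."""
    return (x ^ (x - 1)) & MASK64


def blsr(x: int) -> int:
    """BLSR: clear the lowest set bit (x AND (x - 1))."""
    return x & (x - 1) & MASK64


def popcnt(x: int) -> int:
    """POPCNT: number of set bits."""
    return bin(x & MASK64).count("1")


def tzcnt(x: int) -> int:
    """TZCNT: count of trailing zero bits; 64 when the input is zero."""
    x &= MASK64
    return 64 if x == 0 else (x & -x).bit_length() - 1


def lzcnt(x: int) -> int:
    """LZCNT: count of leading zero bits; 64 when the input is zero."""
    return 64 - (x & MASK64).bit_length()


if __name__ == "__main__":
    x = 0b10110000
    print(bin(blsi(x)), bin(blsmsk(x)), bin(blsr(x)))
    print(popcnt(x), tzcnt(x), lzcnt(x))
```

Unlike the older BSF and BSR instructions, TZCNT and LZCNT have a defined result (the operand width) for a zero input, which is what the zero checks in the model reflect.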
Further Reading
- Programming and Development - https://dcjtech.info/topic/programming-and-development/
- 8051 Microcontroller and Assembly - https://dcjtech.info/topic/8051-microcontroller-and-assembly/
- Hardware Registers - https://dcjtech.info/topic/hardware-registers/
- PDF Cheatsheet Downloads - https://dcjtech.info/topic/free-pdf-cheatsheets/
- Processor Article Forum - https://dcjtech.info/forum/articles/hardware/processors/
- Processor Article Guide/Index - https://dcjtech.info/topic/cpu-topics/