Thursday, 24 November 2016

Buffer Overflow Introduction

BUFFER OVERFLOW
A buffer has a fixed data storage capacity. If the amount of data written to it exceeds that capacity, a buffer overflow occurs.
Buffers are designed to hold a finite amount of data, so any additional data has to go somewhere. The extra data may overflow into neighboring buffers, destroying or overwriting the valid data stored there.

A heap is memory that is allocated dynamically: blocks are created (for example with new or malloc) and released (for example with delete or free) at run time. Heap memory is allocated and freed explicitly by the programmer. Each memory chunk in a heap is associated with boundary tags containing memory management information.

A stack is a contiguous block of memory.

The 32-bit registers EAX, EBX, ECX, EDX, ESI, EDI and EBP are called general-purpose registers and are free to take part in arithmetic operations and memory accesses.
REGISTERS - components of the processor that store data and addresses.

data section - initialized data or constants
bss section - uninitialized variables (declared but given no initial value)
text section - program code

A buffer overflow is an anomaly where a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory. It is a special case of a memory safety violation. Buffer overflows can be triggered by inputs that are designed to execute code or to alter the way the program operates. This may result in erratic program behavior, including memory access errors, incorrect results, a crash, or a breach of system security. Thus, they are the basis of many software vulnerabilities and can be maliciously exploited.

In computer security and programming, a buffer overflow (or buffer overrun) vulnerability appears where an application needs to read external information such as a character string, the receiving buffer is relatively small compared to the possible size of the input string, and the application doesn't check the size. A buffer allocated at run time is placed on the stack, which keeps the information needed for executing functions, such as local variables, argument variables, and the return address. An overflowing string can alter this information, which means an attacker can change it at will. The attacker can also inject a series of machine language commands as a string, which leads to the execution of the attack code.

Buffer overflow
- A generic buffer overflow occurs when a program tries to store more data in a buffer than it was intended to hold
- When the buffer overflow example is compiled and run, an array "Buffer" of size 11 bytes is allocated to hold the "AAAAAAAAAAAAA" string
- strcpy() will copy the string "DDDDDDDDDDDDDDD" into the array "Buffer", which will exceed the buffer size of 11 bytes, resulting in a buffer overflow


A buffer overflow takes place when data written to a buffer, because of insufficient bounds checking, corrupts the data values in memory addresses adjacent to the allocated buffer. Most often this occurs when copying strings of characters from one buffer to another.

When the following program is compiled and run, it assigns a block of memory 11 bytes long to hold the attacker string. The strcpy function then copies the string "DDDDDDDDDDD" into the attacker string, which, together with its terminating null byte, exceeds the buffer size of 11 bytes, resulting in a buffer overflow.
[Figure: Stack buffer overflow]

[Figure: Heap buffer overflow]

For example, the following C program illustrates how a buffer overflow attack works, where an attacker easily manipulates the code:

[Figure: C program with an "attacker" buffer and a "target" buffer]

The C function strcpy() copies the 13 "D" characters into the attacker buffer, whose memory space is only 11 characters. Because there is no space for the remaining "D" characters, they spill into the memory of the "target" buffer, destroying its contents.

Example of a vulnerable program
Which C functions are vulnerable to buffer overflow exploits?
gets()
scanf()
sprintf()
strcpy()
strcat()
bcopy()



























[Figure: Exploit BufferOverflow]

[Figure: Heap buffer overflow]

STACK
A stack is a contiguous block of memory.

How memory is structured

CODE SEGMENT
When a program runs, both code and data are loaded into memory. The code segment is the area where the instructions for the program are located. It contains all the compiled executable code for the program. Write permission to this segment is disabled (so we use LordPE to enable writing when we need to edit it), as the code by itself does not contain any variables and therefore has no need to write over itself. By having the read-only and execute attributes, the code can be shared between different copies of the program that are executing simultaneously.

DATA SEGMENT
The next section holds the data, initialized and/or uninitialized, required by the running code. The segment contains all the global data for the program. It is given the read-write attribute, because programs change their global variables. No 'execute' attribute is set, as global variables are not usually meant for execution.

STACK SEGMENT
Consider the stack as a single-ended data structure with first in, last out data ordering. This means that when two or more elements are "pushed" onto the stack, the later ones have to be "popped" off the stack before the first element can be retrieved. In other words, the most recent element remains on top of the stack. On x86 the top of the stack sits at the lowest address in use, and addresses get higher as one moves down toward the bottom of the stack.

Stack-based Buffer Overflow
Stack-based buffer overflows have been considered the most common type of exploitable programming error found in software applications. A stack overflow occurs when data is written past the end of a buffer in the stack space, causing unpredictable behavior that can often lead to compromise.

Over 100 functions within LibC have security implications. The overflow can overwrite the return pointer so that the flow of control switches to the malicious code. The C language and its derivatives offer many ways to put more data than anticipated into a buffer.

[Figure: Example - Simple uncontrolled overflow]

UNDERSTANDING HEAP


The heap is an area of memory utilized by an application and allocated dynamically at runtime. It is common for a buffer overflow to occur in the heap memory space, and exploitation of these bugs is different from that of stack-based buffer overflows. Heap overflows can be inconsistent and can have varying exploitation techniques and consequences.

Heap memory is different from stack memory: in the heap, memory is persistent between functions, with memory allocated in one function remaining allocated until it is explicitly freed. This means that a heap overflow can occur but go unnoticed until that section of memory is used later. There is no concept of a saved EIP in relation to the heap, but other important things are stored in the heap and can be broken by overflowing dynamic buffers.

The heap consists of many blocks of memory, some of which are allocated to the program and some of which are free. Allocated blocks are often placed in adjacent areas of memory.

HEAP-based Buffer Overflow

An application dynamically allocates heap memory as needed. This allocation occurs through the function call malloc().
The malloc() function is called with an argument specifying the number of bytes to be allocated and returns a pointer to the allocated memory.

- Variables that are dynamically allocated with functions such as malloc() are created on the heap.
- An attacker overflows a buffer that is placed in the lower part of the heap, overwriting other dynamic variables, which can have unexpected and unwanted effects.
- If an application copies data without first checking whether it fits into the target destination, the attacker could supply the application with a piece of data that is too large, overwriting heap management information.
- In most environments, this may allow the attacker to take control of the program's execution.

STACK OPERATION
A stack is implemented by the system for the programs that run on it. A variable can be held within the processor itself, and memory can also be allocated.
The variable is called the "register" and the region of memory is the "stack".
The register used to refer to the stack is the "Stack Pointer", or SP.
The SP points to the top of the stack, while the bottom of the stack is a fixed address.
The kernel adjusts the stack size dynamically at run time.

A stack (memory) frame, or record, is an activation record that is stored on the stack. The stack frame holds the following:
- the parameters to a function
- its local variables
- data required to restore the previous stack frame
- the value of the instruction pointer (the pointer to the next instruction to be fetched) at the time of the function call

There are two major stack operations: push and pop.

When the program is loaded, the stack pointer is set to the highest address. This will be the topmost item in the stack. When an item is pushed onto the stack, two events take place.
First, the size of the item in bytes is subtracted from the current value of the stack pointer.
Next, all the bytes of the item are copied into the region of the stack segment to which the stack pointer now points.

Similarly, when an item is popped from the stack, the size of the item in bytes is added to the stack pointer.

After a pop, the copy of the item continues to reside on the stack. It will eventually be overwritten when the next push operation takes place. Depending on the stack implementation, the stack can grow down toward lower memory addresses or grow up toward higher memory addresses.

When a procedure (function) is called, its arguments and local variables are not the only items pushed onto the stack. The address of the calling procedure's instruction immediately following the procedure call (the return address) is pushed as well. By the time the called function completes, it will have popped its own local variables off the stack, leaving the address of the calling procedure's next instruction on the stack.

Apart from the stack pointer, which points to the top of the stack, there is a frame pointer (FP) that points to a fixed location within a frame. Local variables can be referenced by their offsets from the stack pointer, but those offsets change with every push and pop, so locals are usually addressed relative to the frame pointer instead.
Base pointer - BP
Extended Base Pointer - EBP


Shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is commonly used to exploit stack-based overflows, taking advantage of programming bugs in stack handling. Buffers are soft targets for attackers, as they overflow easily if the conditions match.
Buffer overflow shellcodes, written in assembly language, exploit vulnerabilities in stack and heap memory management.

For example, the VRFY command helps the attacker to identify potential users on the target system by verifying their email addresses. Sendmail uses a set user ID of root and runs with root privileges. If the attacker connects to the sendmail daemon and sends a block of data consisting of 1,000 a's to the VRFY command, the VRFY buffer is overrun, as it was only designed to hold 128 bytes.

The attacker can send specific code that overflows the buffer and executes the command /bin/sh.
Specific assembly code, the "egg", is transferred to the VRFY command as part of the actual string used to overflow the buffer.
When the VRFY buffer is overrun, instead of the offending function returning to its original memory address, the attacker executes the malevolent machine code that was sent as part of the buffer overflow data, which executes /bin/sh with root privileges.

No Operations (NOPs)
The attacker has to find the right address on the stack. If the attacker is off by even one byte, more or less, there can be a segmentation violation or an invalid instruction, which can cause the program to crash. The attacker can increase the odds of hitting the right address by padding the code with NOP instructions.

A NOP is just a command telling the processor to do nothing other than take up the time needed to process the NOP instruction itself. Almost all processors have a NOP instruction that performs a null operation.
On x86 the NOP instruction is 1 byte long and translates to machine code 0x90.

A long run of NOP instructions is called a NOP slide or sled. The CPU does nothing until it reaches the main event, the attacker's code, which precedes the return pointer.
By including NOPs in advance of the executable code, the attacker can avert a segmentation violation if an overwritten return pointer lands execution in the NOPs. The program then continues to execute down the stack until it reaches the attacker's exploit.

BUFFER OVERFLOW STEPS
Step 1. Check whether the target application or program is vulnerable to buffer overflow. Typically a buffer overflow occurs when the input entered exceeds the size of the buffer.
If a potential buffer overflow vulnerability is present in the program, it displays an error when you enter a lengthy string (exceeding the size of the buffer). In this way you can confirm whether or not a program contains a buffer overflow vulnerability. If it is vulnerable, find the location of the buffer overflow vulnerability.

Step 2. Once you find the location of the vulnerability, write more data into the buffer than it can handle. This causes the buffer overflow.

Step 3. When a buffer overflow occurs, it overwrites memory. Taking advantage of this, you can overwrite the return address of a function with the address of the shellcode.

Step 4. When the overwrite occurs, the execution flow changes from the normal path to the shellcode. Thus, you can execute anything you want.


[Figure: Normal program]

[Figure: Buffer Overflow example]

ATTACKING
When attacking a real program, the attacker needs to know that a string function is being exploited and sends a long string as the input. Once the input string is passed in, it overflows the buffer and causes a segmentation error. The return pointer of the function is overwritten, and the attacker succeeds in altering the flow of execution. What the attacker needs:
- to know the exact address on the stack
- to know the size of the stack
- to make the return pointer point to the code to be executed

The challenges the attacker faces:
- determining the size of the buffer
- determining the address of the stack, which is needed to get the input to overwrite the return pointer
- writing a program small enough that it can be passed through as input

The goal of the attacker is to get a shell and use it to issue further commands.

Example of such a program in C
The attacker can place arbitrary code to be executed in the buffer that is to be overflowed, and overwrite the return address so that it points back into the buffer.


The attacker must know the exact location, in the memory space of the program, of the code to be executed. A workaround for this problem is to use a jump (JMP) and a CALL instruction. These instructions allow relative addressing and let the attacker point to an offset relative to the instruction pointer (IP), which removes the need to know the exact address in memory to which the exploit code must point.
As most operating systems mark the code pages with the read-only attribute, this workaround is unfeasible there.
The alternative is to place the code to be executed into the stack or data segment and transfer control to it.
One way of achieving this is to place the code in a global array in the data segment.
Any null byte occurring in the shellcode is treated as the end of the string, so the transfer of the code is terminated at that point.

FORMAT STRING PROBLEM
Format string problems occur when input is supplied from an untrusted source and that data is passed as the format string argument to functions such as
syslog()
scanf()
sprintf()

Format string vulnerabilities in C/C++ can easily be exploited because of the %n conversion specifier. If a program contains this kind of vulnerability, the program's confidentiality and access control may be at risk, because exploiting a format string vulnerability results in information disclosure and execution of arbitrary code.

OVERFLOW USING FORMAT STRING
In C, the format string library functions take a variable number of arguments. The format string contains format-directive characters and printable characters.
Format string overflow attacks are quite similar to buffer overflow attacks. The attacker attempts to change the memory space and consequently runs arbitrary code.
The difference is that the attacker launches the format string overflow attack by exploiting vulnerabilities in variadic functions, such as the format functions.
Format string overflows can be exploited in four ways:
- memory viewing
- updating a word present in memory
- causing a buffer overflow by using a minimum field size specifier
- using the %n format directive to overwrite the code


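Example in C of a buffer overflow using the format string problem:

char errmsg[512];
char outbuf[512];
/* user holds attacker-controlled input */
sprintf(errmsg, "illegal command: %400s", user);
/* vulnerable: errmsg, which contains user, is reused as the format string */
sprintf(outbuf, errmsg);

If user = "%500d<nop><shellcode>",
this will bypass the "%400s" limitation and overflow outbuf: the second sprintf treats the %500d inside errmsg as a directive and expands it to 500 characters.
The stack smashing buffer overflow attack is then carried out.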

Smashing the Stack
- The general idea is to overflow a buffer so that it overwrites the return address
- When the function is done, it will jump to whatever address is on the stack
- Put some code in the buffer and set the return address to point to it
- Buffer overflow allows us to change the return address of a function

Smashing the stack causes a stack to overflow. The stack is a first-in, last-out buffer used to hold the intermediate results of an operation.

Once the Stack is Smashed
There are two parts to the attacker's input: an injection vector and a payload. They may be separate or put together.
The injection vector is the correct entry point and is tied to the bug itself. It is OS/target/application/protocol/encoding-dependent. On the other hand, the payload is usually not tied to the bug at all and is limited only by the attacker's skill. It also depends on the machine, the processor and so on.


[Figure: Normal code]

[Figure: Heap buffer overflow]

[Figure: Stack buffer overflow]

How to Mutate a Buffer Overflow Exploit
- for the NOP Portion
- apply XOR to combine the code with a random key so that it is unintelligible to an IDS
- return pointer


WRITING THE EXPLOIT OF BUFFER OVERFLOW

Theory
Every Windows application uses parts of memory. The process memory contains 3 major components:

code segment (instructions that the processor executes; the EIP keeps track of the next instruction)
data segment (variables, dynamic buffers)
stack segment (used to pass data/arguments to functions, and used as space for variables). The stack starts (= the bottom of the stack) at the very end of the virtual memory of a page and grows down (to a lower address). A PUSH adds something to the top of the stack; a POP removes one item (4 bytes) from the stack and puts it in a register.

If you want to access the stack memory directly, you can use ESP (Stack Pointer), which points at the top (so the lowest memory address) of the stack.

After a PUSH, ESP will point to a lower memory address (the address is decremented by the size of the data that is pushed onto the stack, which is 4 bytes in the case of addresses/pointers). Decrements usually happen before the item is placed on the stack (depending on the implementation: if ESP already points at the next free location in the stack, the decrement happens after placing data on the stack).
After a POP, ESP points to a higher address (the address is incremented, by 4 bytes in the case of addresses/pointers). Increments happen after an item is removed from the stack.

When a function/subroutine is entered, a stack frame is created. This frame keeps the parameters of the parent procedure together and is used to pass arguments to the subroutine. The current location of the stack can be accessed via the stack pointer (ESP); the current base of the function is contained in the base pointer (EBP), also called the frame pointer.

The CPU's general purpose registers (Intel, x86) are:

EAX : accumulator : used for performing calculations, and used to store return values from function calls. Basic operations such as add, subtract, compare use this general-purpose register
EBX : base (does not have anything to do with the base pointer). It has no dedicated purpose and can be used to store data.
ECX : counter : used for iterations. ECX counts downward.
EDX : data : this is an extension of the EAX register. It allows for more complex calculations (multiply, divide) by allowing extra data to be stored to facilitate those calculations.
ESP : stack pointer
EBP : base pointer
ESI : source index : holds the location of input data
EDI : destination index : points to the location where the result of a data operation is stored
EIP : instruction pointer

The text segment of a program image / DLL is read-only, as it only contains the application code. This prevents people from modifying the application code. This memory segment has a fixed size.
The data segment is used to store global and static program variables: initialized global variables, strings, and other constants.
The data segment is writable and has a fixed size.
The heap segment is used for the rest of the program variables. It can grow larger or smaller as desired. All of the memory in the heap is managed by allocator (and deallocator) algorithms, which reserve memory regions on request. The heap grows toward higher addresses.
In a DLL, the code, the imports (list of functions used by the DLL, from another DLL or application), and the exports (functions it makes available to other DLLs and applications) are part of the .text segment.

The Stack
The stack is a piece of the process memory, a data structure that works LIFO (last in, first out). A stack gets allocated by the OS for each thread (when the thread is created). When the thread ends, the stack is cleared as well. The size of the stack is defined when it gets created and doesn't change. Because of its LIFO behavior and the fact that it does not require complex management structures or mechanisms, the stack is pretty fast, but limited in size.

When a stack is created, the stack pointer points to the top of the stack (= the highest address on the stack). As information is pushed onto the stack, this stack pointer decrements (goes to a lower address). So in essence, the stack grows toward a lower address.

The stack contains local variables, function calls and other information that does not need to be stored for a long time. As more data is added to the stack (pushed onto the stack), the stack pointer is decremented and points at a lower address value.

Every time a function is called, the function parameters are pushed onto the stack, as well as the saved values of registers (EBP, EIP). When a function returns, the saved value of EIP is retrieved from the stack and placed back in EIP, so the normal application flow can be resumed.
