[PE Structure Analysis] 5. IMAGE_OPTIONAL_HEADER

The structure is defined as follows (the +NNh values are offsets from the start of the PE signature):

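typedef struct _IMAGE_OPTIONAL_HEADER 
{
    //
    // Standard fields.  
    //
+18h    WORD    Magic;                   // Magic number: ROM image (0107h), PE32 executable (010Bh), PE32+ executable (020Bh)
+1Ah    BYTE    MajorLinkerVersion;      // Linker major version
+1Bh    BYTE    MinorLinkerVersion;      // Linker minor version
+1Ch    DWORD   SizeOfCode;              // Total size of all sections containing code
+20h    DWORD   SizeOfInitializedData;   // Total size of all sections containing initialized data
+24h    DWORD   SizeOfUninitializedData; // Total size of all sections containing uninitialized data
+28h    DWORD   AddressOfEntryPoint;     // RVA of the program entry point ***(must know)***
+2Ch    DWORD   BaseOfCode;              // RVA of the start of the code section
+30h    DWORD   BaseOfData;              // RVA of the start of the data section
    //
    // NT additional fields.
    //
+34h    DWORD   ImageBase;               // Preferred load address of the image ***(must know)***
+38h    DWORD   SectionAlignment;        // Alignment of sections in memory ***(must know)***
+3Ch    DWORD   FileAlignment;           // Alignment of sections in the file ***(must know)***
+40h    WORD    MajorOperatingSystemVersion;  // Major version of the minimum required operating system
+42h    WORD    MinorOperatingSystemVersion;  // Minor version of the minimum required operating system
+44h    WORD    MajorImageVersion;       // Major version number of the image itself
+46h    WORD    MinorImageVersion;       // Minor version number of the image itself
+48h    WORD    MajorSubsystemVersion;   // Major version of the minimum required subsystem
+4Ah    WORD    MinorSubsystemVersion;   // Minor version of the minimum required subsystem
+4Ch    DWORD   Win32VersionValue;       // Reserved field; normally 0 unless abused by malware
+50h    DWORD   SizeOfImage;             // Total size of the image once loaded into memory
+54h    DWORD   SizeOfHeaders;           // Combined size of all headers plus the section table
+58h    DWORD   CheckSum;                // Image checksum
+5Ch    WORD    Subsystem;               // Subsystem the executable expects ***(must know)***
+5Eh    WORD    DllCharacteristics;      // Image characteristic flags, e.g. ASLR/DEP opt-in (early documentation: when DllMain() is called, default 0)
+60h    DWORD   SizeOfStackReserve;      // Stack size to reserve at startup
+64h    DWORD   SizeOfStackCommit;       // Stack size actually committed at startup
+68h    DWORD   SizeOfHeapReserve;       // Heap size to reserve at startup
+6Ch    DWORD   SizeOfHeapCommit;        // Heap size actually committed at startup
+70h    DWORD   LoaderFlags;             // Reserved (formerly debug-related); 0 by default
+74h    DWORD   NumberOfRvaAndSizes;     // Number of entries in the data directory below; this has been 16 ever since Windows NT was released
+78h    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];   
// Data directory table ***(must know, key point)***; from Windows NT through Windows 10, IMAGE_NUMBEROF_DIRECTORY_ENTRIES has always been 16
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;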

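AddressOfEntryPoint  ***(must know)***

The address at which the program begins executing, given as an RVA (relative virtual address). For an EXE this points to the startup code; for a DLL it points to the DLL's entry function (the routine that ends up calling DllMain()), and it may be 0 if the DLL has none. If code is appended to an executable and should run first, it is enough to point this entry address at the appended code. When unpacking a packed file, the first task is to find the entry point, and this field is what that refers to.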
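To make the offsets concrete, here is a minimal sketch of reading AddressOfEntryPoint from a PE file that has been read into a byte buffer. It assumes nothing beyond standard C. The helper names (rd16, rd32, pe_entry_point_rva) are made up for illustration and are not part of any Windows API. The field sits at +28h from the PE signature in both PE32 and PE32+ files.

```c
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values byte by byte, so alignment does not matter. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: stores the entry-point RVA in *rva and returns 1,
   or returns 0 if the buffer is too short or the signatures do not match. */
int pe_entry_point_rva(const uint8_t *buf, size_t len, uint32_t *rva)
{
    uint32_t e_lfanew;

    if (len < 0x40 || rd16(buf) != 0x5A4D)          /* "MZ" */
        return 0;
    e_lfanew = rd32(buf + 0x3C);                    /* IMAGE_DOS_HEADER.e_lfanew */
    if (e_lfanew > len || len - e_lfanew < 0x2C)    /* need bytes up to +2Ch */
        return 0;
    if (rd32(buf + e_lfanew) != 0x00004550)         /* "PE\0\0" */
        return 0;
    *rva = rd32(buf + e_lfanew + 0x28);             /* AddressOfEntryPoint */
    return 1;
}
```

The returned value is still an RVA: add ImageBase to get the virtual address in a loaded image, or map it through the section table to get a file offset.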

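ImageBase  ***(must know)***

The preferred load address of the PE file. When the file is executed, Windows loads it at the address given by the ImageBase field if possible (that is, if that address range is not already in use).

For an EXE, every process has its own virtual address space, so the preferred address cannot already be occupied by another module, and without ASLR the EXE can always be loaded at this address.

This is why EXE files traditionally did not need relocation information.

For a DLL, several DLLs share the address space of the host EXE, so there is no guarantee that the preferred address is not already used by another DLL. DLL files must therefore carry relocation information just in case.

Consequently, in the Characteristics field of the IMAGE_FILE_HEADER structure described earlier, the IMAGE_FILE_RELOCS_STRIPPED bit is 0 for DLL files and was traditionally 1 for EXE files. EXE files linked with ASLR support (/DYNAMICBASE) keep their relocations, so the bit is 0 for them as well.

If no value is specified, the default is 0x10000000 for DLL files and 0x00400000 for EXE files, except on Windows CE, where it is 0x00010000. The value must be a multiple of 64 KB.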

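SectionAlignment ***(must know)***

The alignment unit for sections in memory. Every section starts at a multiple of this value. The field must be greater than or equal to FileAlignment, and its default is the page size of the target architecture: 0x1000 (4096 bytes, 4 KB) on x86 and x64, and 8 KB on architectures with larger pages such as Itanium.

FileAlignment ***(must know)***

The alignment unit, in bytes, for section data in the PE file. The value must be a power of 2 between 512 and 64 K inclusive (the closed interval [512, 64*1024 = 65536]). If SectionAlignment is smaller than the system page size, FileAlignment must equal SectionAlignment. The default is 512 bytes (0.5 KB), that is, 0x200.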
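Both alignment fields are applied by rounding sizes and offsets up to the next multiple. The sketch below shows that rounding and a check of the FileAlignment constraints listed above. The function names are illustrative only, and the check deliberately ignores the special case in which SectionAlignment is smaller than the page size.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
   alignment must be a nonzero power of two. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Returns 1 if x is a power of two, 0 otherwise. */
int is_power_of_two(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Checks the usual constraints: FileAlignment is a power of two in
   [512, 65536] and SectionAlignment is not smaller than FileAlignment. */
int file_alignment_ok(uint32_t file_align, uint32_t section_align)
{
    return is_power_of_two(file_align) &&
           file_align >= 512 && file_align <= 65536 &&
           section_align >= file_align;
}
```

For example, a section with 0x1234 bytes of data occupies align_up(0x1234, 0x200) = 0x1400 bytes in the file and align_up(0x1234, 0x1000) = 0x2000 bytes in memory when the default alignments are used.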

Subsystem ***(must know)***

The subsystem that the PE file requires for its user interface. The values are defined as follows:

#define IMAGE_SUBSYSTEM_UNKNOWN              0   // Unknown subsystem
#define IMAGE_SUBSYSTEM_NATIVE               1   // No subsystem required (for example, device drivers)
#define IMAGE_SUBSYSTEM_WINDOWS_GUI          2   // Windows GUI subsystem
#define IMAGE_SUBSYSTEM_WINDOWS_CUI          3   // Windows console subsystem
#define IMAGE_SUBSYSTEM_OS2_CUI              5   // OS/2 console subsystem
#define IMAGE_SUBSYSTEM_POSIX_CUI            7   // POSIX console subsystem
#define IMAGE_SUBSYSTEM_NATIVE_WINDOWS       8   // Image is a native Win9x driver
#define IMAGE_SUBSYSTEM_WINDOWS_CE_GUI       9   // Windows CE GUI

In Visual Studio 2015, for example, the subsystem can be chosen in the linker settings of the project properties.

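DataDirectory ***(must know, important)***

This is one of the most important fields. It is an array of 16 identical IMAGE_DATA_DIRECTORY structures, defined as follows:

typedef struct _IMAGE_DATA_DIRECTORY {
   DWORD   VirtualAddress; // RVA of the data block
   DWORD   Size;           // Size of the data block in bytes
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

Each entry simply gives the location and size of one block of data.

Although the data in a PE file is grouped into sections according to the page attributes it needs once loaded, the data inside those sections can also be classified by purpose: export table, import table, resources, relocation table, and so on. The 16 IMAGE_DATA_DIRECTORY structures describe these blocks, as listed below.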

#define IMAGE_DIRECTORY_ENTRY_EXPORT          0   // Export table
#define IMAGE_DIRECTORY_ENTRY_IMPORT          1   // Import table
#define IMAGE_DIRECTORY_ENTRY_RESOURCE        2   // Resource table
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION       3   // Exception table
#define IMAGE_DIRECTORY_ENTRY_SECURITY        4   // Security (certificate) table
#define IMAGE_DIRECTORY_ENTRY_BASERELOC       5   // Base relocation table
#define IMAGE_DIRECTORY_ENTRY_DEBUG           6   // Debug directory
//      IMAGE_DIRECTORY_ENTRY_COPYRIGHT       7   // (X86 usage) Copyright information
#define IMAGE_DIRECTORY_ENTRY_ARCHITECTURE    7   // Architecture-specific data
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR       8   // RVA of GP (global pointer)
#define IMAGE_DIRECTORY_ENTRY_TLS             9   // TLS Directory (thread local storage)
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG    10   // Load Configuration Directory
#define IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT   11   // Bound Import Directory in headers
#define IMAGE_DIRECTORY_ENTRY_IAT            12   // Import address table
#define IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT   13   // Delay Load Import Descriptors
#define IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR 14   // COM Runtime descriptor
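As an illustration of how the table is indexed, the following sketch reads one data directory entry straight from the raw header bytes. The DataDir type and the function names are hypothetical. The offsets +74h/+78h come from the PE32 layout above, and the PE32+ offsets (+84h/+88h) are stated here as an assumption about the 64-bit layout, where ImageBase and the four stack and heap sizes grow to 8 bytes and BaseOfData is dropped. The sketch only accepts the 16 standard entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of IMAGE_DATA_DIRECTORY. */
typedef struct {
    uint32_t rva;
    uint32_t size;
} DataDir;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* nt points at the "PE\0\0" signature and avail is the number of bytes
   readable from there. Returns 1 and fills *out on success, else 0. */
int pe_data_directory(const uint8_t *nt, size_t avail,
                      unsigned index, DataDir *out)
{
    uint16_t magic;
    size_t count_off, entry_off;

    if (index >= 16 || avail < 0x1A)
        return 0;
    magic = rd16(nt + 0x18);                  /* Magic */
    if (magic == 0x010B)
        count_off = 0x74;                     /* PE32:  NumberOfRvaAndSizes */
    else if (magic == 0x020B)
        count_off = 0x84;                     /* PE32+: NumberOfRvaAndSizes */
    else
        return 0;
    if (avail < count_off + 4 || index >= rd32(nt + count_off))
        return 0;
    entry_off = count_off + 4 + (size_t)index * 8;
    if (avail < entry_off + 8)
        return 0;
    out->rva  = rd32(nt + entry_off);
    out->size = rd32(nt + entry_off + 4);
    return 1;
}
```

Calling it with index 1 (IMAGE_DIRECTORY_ENTRY_IMPORT) yields the RVA and size of the import table. An RVA of 0 conventionally means the directory is absent.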

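  • When a PE file is created with the SDK or Visual C++, the default ImageBase is 00400000 for an EXE and 10000000 for a DLL. SYS files created with the DDK default to an ImageBase of 10000.

  • Windows Vista and later versions introduced the ASLR security mechanism: an EXE that opts in can be loaded at a randomized address instead of its preferred one, which makes the system harder to attack.

  • In PE files generated by VC++ the relocation section is named .reloc. An EXE that loads at its preferred ImageBase still runs normally after this section is deleted.

  • Deleting .reloc:

  • First look at the .reloc entry in the IMAGE_SECTION_HEADER table and note the size of that section header, the file offset of the .reloc section, and its Virtual Size.

  • Then overwrite the .reloc section header with zeros and delete the entire .reloc section.

  • After deleting the section, decrement the Number of Sections field in IMAGE_FILE_HEADER.

  • Adjust the image size through the Size of Image field in IMAGE_OPTIONAL_HEADER.

  • The amount to subtract is the Virtual Size recorded earlier, rounded up to a multiple of the Section Alignment value in IMAGE_OPTIONAL_HEADER.

  • According to the PE file format specification, the start of IMAGE_NT_HEADERS is variable and is determined by the value of e_lfanew in IMAGE_DOS_HEADER. Its typical value depends on the build environment.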

PEFILE.H

#define OPTHDROFFSET(a) ((LPVOID)((BYTE *)(a)               + \
    ((PIMAGE_DOS_HEADER)(a))->e_lfanew + SIZE_OF_NT_SIGNATURE + \
    sizeof (IMAGE_FILE_HEADER)))

The optional header contains most of the meaningful information about the executable image, such as initial stack size, program entry point location, preferred base address, operating system version, section alignment information, and so forth. The IMAGE_OPTIONAL_HEADER structure represents the optional header as follows:

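  • SizeOfOptionalHeader in IMAGE_FILE_HEADER gives the length of the IMAGE_OPTIONAL_HEADER structure. It also determines the starting offset of the section headers (IMAGE_SECTION_HEADER).

  • The IMAGE_SECTION_HEADER array begins at the start offset of IMAGE_OPTIONAL_HEADER plus the value of SizeOfOptionalHeader.

  • IMAGE_OPTIONAL_HEADER is 0xE0 (224) bytes in 32-bit PE32 files and 0xF0 (240) bytes in 64-bit PE32+ files.

  • The Import_Table entry in Data_Directories is eight bytes long: the first four bytes are the address (RVA) of the import table and the last four bytes are its size (SIZE). In the example examined here, the import table's RVA is 271EE.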
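The section header offset described in the list above reduces to a single sum. A minimal sketch, with a made-up function name and constant names chosen for this example:

```c
#include <stdint.h>

#define NT_SIGNATURE_SIZE 4u    /* "PE\0\0" */
#define FILE_HEADER_SIZE  20u   /* IMAGE_SIZEOF_FILE_HEADER */

/* File offset of the first IMAGE_SECTION_HEADER, computed from e_lfanew
   and IMAGE_FILE_HEADER.SizeOfOptionalHeader. */
uint32_t first_section_header_offset(uint32_t e_lfanew,
                                     uint16_t size_of_optional_header)
{
    return e_lfanew + NT_SIGNATURE_SIZE + FILE_HEADER_SIZE +
           size_of_optional_header;
}
```

With e_lfanew = 0xE0, a PE32 file (SizeOfOptionalHeader = 0xE0) has its first section header at 0x1D8, and a PE32+ file (0xF0) at 0x1E8. Reading SizeOfOptionalHeader from the file rather than hard-coding sizeof(IMAGE_OPTIONAL_HEADER) keeps the calculation valid for both formats.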

WINNT.H

#define IMAGE_DOS_SIGNATURE             0x5A4D      // MZ
#define IMAGE_OS2_SIGNATURE             0x454E      // NE
#define IMAGE_OS2_SIGNATURE_LE          0x454C      // LE
#define IMAGE_NT_SIGNATURE              0x00004550  // PE00

At first it seems curious that Windows executable file types do not appear on this list. But then, after a little investigation, the reason becomes clear: There really is no difference between Windows executables and OS/2 executables other than the operating system version specification. Both operating systems share the same executable file structure.
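The four signatures can be told apart with a few comparisons once the MZ header has been checked. The sketch below is not the article's own code. The ImageKind enum and classify_image are hypothetical names, and the image is assumed to be a plain byte buffer.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    SIG_UNKNOWN,    /* no MZ header at all */
    SIG_DOS_ONLY,   /* MZ header but no recognized new-style signature */
    SIG_OS2_NE,
    SIG_OS2_LE,
    SIG_NT_PE
} ImageKind;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

ImageKind classify_image(const uint8_t *buf, size_t len)
{
    uint32_t e_lfanew;

    if (len < 0x40 || rd16(buf) != 0x5A4D)       /* IMAGE_DOS_SIGNATURE */
        return SIG_UNKNOWN;
    e_lfanew = rd32(buf + 0x3C);
    if (e_lfanew == 0 || e_lfanew > len - 4)     /* signature must fit */
        return SIG_DOS_ONLY;
    if (rd32(buf + e_lfanew) == 0x00004550)      /* IMAGE_NT_SIGNATURE */
        return SIG_NT_PE;
    if (rd16(buf + e_lfanew) == 0x454E)          /* IMAGE_OS2_SIGNATURE */
        return SIG_OS2_NE;
    if (rd16(buf + e_lfanew) == 0x454C)          /* IMAGE_OS2_SIGNATURE_LE */
        return SIG_OS2_LE;
    return SIG_DOS_ONLY;
}
```

The PE signature is compared as a full DWORD, while NE and LE are two-byte signatures, which is why the checks use different widths.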

Turning our attention back to the Windows NT PE file format, we find that once we have the location of the file signature, the PE file header follows four bytes later. The next macro identifies the PE file header:

WINNT.H

typedef struct _IMAGE_RESOURCE_DIR_STRING_U {
    USHORT  Length;
    WCHAR   NameString[ 1 ];
} IMAGE_RESOURCE_DIR_STRING_U, *PIMAGE_RESOURCE_DIR_STRING_U;

This structure is simply a 2-byte Length field followed by Length UNICODE characters.

On the other hand, if the most significant bit of the Name field is clear, the lower 31 bits are used to represent the integer ID of the resource. Figure 2 shows the menu resource as a named resource and the string table as an ID resource.

If there were two menu resources, one identified by name and one by ID, they would both have entries immediately after the menu resource directory. The named resource entry would appear first, followed by the integer-identified resource. The directory fields NumberOfNamedEntries and NumberOfIdEntries would each contain the value 1, indicating the presence of one entry.

Below level two, the resource tree does not branch out any further. Level one branches into directories representing each type of resource, and level two branches into directories representing each resource by identifier. Level three maps a one-to-one correspondence between the individually identified resources and their respective language IDs. To indicate the language ID of a resource, the Name field of the directory entry structure is used to indicate both the primary language and sublanguage ID for the resource. The Win32 SDK for Windows NT lists the default value resources. For the value 0x0409, the low 10 bits (0x09) give the primary language LANG_ENGLISH, and the high 6 bits (0x01) give the sublanguage SUBLANG_ENGLISH_US. The entire set of language IDs is defined in the file WINNT.H, included as part of the Win32 SDK for Windows NT.
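A language ID packs the primary language into its low 10 bits and the sublanguage into its high 6 bits, which is the layout the PRIMARYLANGID and SUBLANGID macros in WINNT.H rely on. The sketch below reimplements that split as plain functions. The function names are made up for this example.

```c
#include <stdint.h>

/* Low 10 bits: primary language (for example 0x09 = LANG_ENGLISH). */
uint16_t primary_lang(uint16_t langid)
{
    return (uint16_t)(langid & 0x3FF);
}

/* High 6 bits: sublanguage (for example 0x01 = SUBLANG_ENGLISH_US). */
uint16_t sub_lang(uint16_t langid)
{
    return (uint16_t)(langid >> 10);
}
```

For 0x0409 this gives primary language 0x09 and sublanguage 0x01 (US English). Canadian English, with sublanguage 0x04, corresponds to the language ID 0x1009.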

Since the language ID node is the last directory node in the tree, the OffsetToData field in the entry structure is an offset to a leaf node—the IMAGE_RESOURCE_DATA_ENTRY structure mentioned earlier.

Referring back to Figure 2, you can see one data entry node for each language directory entry. This node simply indicates the size of the resource data and the relative virtual address where the resource data is located.

One advantage to having so much structure to the resource data section, .rsrc, is that you can glean a great deal of information from the section without accessing the resources themselves. For example, you can find out how many there are of each type of resource, what resources—if any—use a particular language ID, whether a particular resource exists or not, and the size of individual types of resources. To demonstrate how to make use of this information, the following function shows how to determine the different types of resources a file includes:

e_lfanew = size of the MZ header (0x40) + size of the DOS stub (variable; 0xA0 under VC++) = 0xE0

WINNT.H

typedef struct _IMAGE_RESOURCE_DIRECTORY {
    ULONG   Characteristics;
    ULONG   TimeDateStamp;
    USHORT  MajorVersion;
    USHORT  MinorVersion;
    USHORT  NumberOfNamedEntries;
    USHORT  NumberOfIdEntries;
} IMAGE_RESOURCE_DIRECTORY, *PIMAGE_RESOURCE_DIRECTORY;

Looking at the directory structure, you won't find any pointer to the next nodes. Instead, there are two fields, NumberOfNamedEntries and NumberOfIdEntries, used to indicate how many entries are attached to the directory. By attached, I mean the directory entries follow immediately after the directory in the section data. The named entries appear first in ascending alphabetical order, followed by the ID entries in ascending numerical order.

A directory entry consists of two fields, as described in the following IMAGE_RESOURCE_DIRECTORY_ENTRY structure:

Debug information section, .debug

Debug information is initially placed in the .debug section. The PE file format also supports separate debug files (normally identified with a .DBG extension) as a means of collecting debug information in a central location. The debug section contains the debug information, but the debug directories live in the .rdata section mentioned earlier. Each of those directories references debug information in the .debug section. The debug directory structure is defined as an IMAGE_DEBUG_DIRECTORY, as follows:

Import data section, .idata

The .idata section is import data, including the import directory and import address name table. Although an IMAGE_DIRECTORY_ENTRY_IMPORT directory is defined, no corresponding import directory structure is included in the file WINNT.H. Instead, there are several other structures called IMAGE_IMPORT_BY_NAME, IMAGE_THUNK_DATA, and IMAGE_IMPORT_DESCRIPTOR. Personally, I couldn't make heads or tails of how these structures are supposed to correlate to the .idata section, so I spent several hours deciphering the .idata section body and came up with a much simpler structure. I named this structure IMAGE_IMPORT_MODULE_DIRECTORY.

Introduction

The recent addition of the Microsoft® Windows NT™ operating system to the family of Windows™ operating systems brought many changes to the development environment and more than a few changes to applications themselves. One of the more significant changes is the introduction of the Portable Executable (PE) file format. The new PE file format draws primarily from the COFF (Common Object File Format) specification that is common to UNIX® operating systems. Yet, to remain compatible with previous versions of the MS-DOS® and Windows operating systems, the PE file format also retains the old familiar MZ header from MS-DOS.

In this article, the PE file format is explained using a top-down approach. This article discusses each of the components of the file as they occur when you traverse the file's contents, starting at the top and working your way down through the file.

Much of the definition of individual file components comes from the file WINNT.H, a file included in the Microsoft Win32™ Software Development Kit (SDK) for Windows NT. In it you will find structure type definitions for each of the file headers and data directories used to represent various components in the file. In other places in the file, WINNT.H lacks sufficient definition of the file structure. In these places, I chose to define my own structures that can be used to access the data from the file. You will find these structures defined in PEFILE.H, a file used to create the PEFILE.DLL. The entire suite of PEFILE.H development files is included in the PEFile sample application.

In addition to the PEFILE.DLL sample code, a separate Win32-based sample application called EXEVIEW.EXE accompanies this article. This sample was created for two purposes: First, I needed a way to be able to test the PEFILE.DLL functions, which in some cases required multiple file views simultaneously—hence the multiple view support. Second, much of the work of figuring out PE file format involved being able to see the data interactively. For example, to understand how the import address name table is structured, I had to view the .idata section header, the import image data directory, the optional header, and the actual .idata section body, all simultaneously. EXEVIEW.EXE is the perfect sample for viewing that information.

Without further ado, let's begin.

PEFILE.C

int  WINAPI GetExportFunctionNames (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszFunctions)
{
    IMAGE_SECTION_HEADER       sh;
    PIMAGE_EXPORT_DIRECTORY    ped;
    char                       *pNames, *pCnt;
    int                        i, nCnt;

    /* Get section header and pointer to data directory
       for .edata section. */
    if ((ped = (PIMAGE_EXPORT_DIRECTORY)ImageDirectoryOffset
            (lpFile, IMAGE_DIRECTORY_ENTRY_EXPORT)) == NULL)
        return 0;
    GetSectionHdrByName (lpFile, &sh, ".edata");

    /* Determine the offset of the export function names.
       Note: the (int) casts assume 32-bit pointers. */
    pNames = (char *)(*(int *)((int)ped->AddressOfNames -
                               (int)sh.VirtualAddress   +
                               (int)sh.PointerToRawData +
                               (int)lpFile)    -
                      (int)sh.VirtualAddress   +
                      (int)sh.PointerToRawData +
                      (int)lpFile);

    /* Figure out how much memory to allocate for all strings. */
    pCnt = pNames;
    for (i=0; i<(int)ped->NumberOfNames; i++)
        while (*pCnt++);
    nCnt = (int)(pCnt - pNames);

    /* Allocate memory off heap for function names. */
    *pszFunctions = HeapAlloc (hHeap, HEAP_ZERO_MEMORY, nCnt);
    if (*pszFunctions == NULL)
        return 0;

    /* Copy all strings to buffer. */
    CopyMemory ((LPVOID)*pszFunctions, (LPVOID)pNames, nCnt);

    return nCnt;
}

Notice that in this function the variable pNames is assigned by determining first the address of the offset and then the actual offset location. Both the address of the offset and the offset itself are relative virtual addresses and must be translated before being used, as the function demonstrates. You could write a similar function to determine the ordinal values or entry points of the functions, but why bother when I already did this for you? The GetNumberOfExportedFunctions, GetExportFunctionEntryPoints, and GetExportFunctionOrdinals functions also exist in the PEFILE.DLL.

WINNT.H

typedef struct _IMAGE_FILE_HEADER {
    USHORT  Machine;
    USHORT  NumberOfSections;
    ULONG   TimeDateStamp;
    ULONG   PointerToSymbolTable;
    ULONG   NumberOfSymbols;
    USHORT  SizeOfOptionalHeader;
    USHORT  Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

#define IMAGE_SIZEOF_FILE_HEADER             20

Notice that the size of the file header structure is conveniently defined in the include file. This makes it easy to get the size of the structure, but I found it easier to use the sizeof function on the structure itself because it does not require me to remember the name of the constant IMAGE_SIZEOF_FILE_HEADER in addition to the IMAGE_FILE_HEADER structure name itself. On the other hand, remembering the name of all the structures proved challenging enough, especially since none of these structures is documented anywhere except in the WINNT.H include file.

The information in the PE file header is basically high-level information that is used by the system or applications to determine how to treat the file. The first field is used to indicate what type of machine the executable was built for, such as the DEC® Alpha, MIPS R4000, Intel® x86, or some other processor. The system uses this information to quickly determine how to treat the file before going any further into the rest of the file data.

The Characteristics field identifies specific characteristics about the file. For example, consider how separate debug files are managed for an executable. It is possible to strip debug information from a PE file and store it in a debug file (.DBG) for use by debuggers. To do this, a debugger needs to know whether to find the debug information in a separate file or not and whether the information has been stripped from the file or not. A debugger could find out by drilling down into the executable file looking for debug information. To save the debugger from having to search the file, a file characteristic that indicates that the file has been stripped (IMAGE_FILE_DEBUG_STRIPPED) was invented. Debuggers can look in the PE file header to quickly determine whether the debug information is present in the file or not.

WINNT.H defines several other flags that indicate file header information much the way the example described above does. I'll leave it as an exercise for the reader to look up the flags to see if any of them are interesting or not. They are located in WINNT.H immediately after the IMAGE_FILE_HEADER structure described above.

One other useful entry in the PE file header structure is the NumberOfSections field. It turns out that you need to know how many sections—more specifically, how many section headers and section bodies—are in the file in order to extract the information easily. Each section header and section body is laid out sequentially in the file, so the number of sections is necessary to determine where the section headers and bodies end. The following function extracts the number of sections from the PE file header:

WINNT.H

#define IMAGE_SIZEOF_SHORT_NAME              8

typedef struct _IMAGE_SECTION_HEADER {
    UCHAR   Name[IMAGE_SIZEOF_SHORT_NAME];
    union {
            ULONG   PhysicalAddress;
            ULONG   VirtualSize;
    } Misc;
    ULONG   VirtualAddress;
    ULONG   SizeOfRawData;
    ULONG   PointerToRawData;
    ULONG   PointerToRelocations;
    ULONG   PointerToLinenumbers;
    USHORT  NumberOfRelocations;
    USHORT  NumberOfLinenumbers;
    ULONG   Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

How do you go about getting section header information for a particular section? Since section headers are organized sequentially in no specific order, section headers must be located by name. The following function shows how to retrieve a section header from a PE image file given the name of the section:
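The lookup by name can be sketched as follows. This is not the PEFILE.DLL function itself: find_section and the two constants are hypothetical, and the section headers are treated as raw 40-byte records so that the example needs no Windows headers. Section names are NUL-padded to eight bytes and are not NUL-terminated when they use all eight, so the comparison covers the full field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTION_HEADER_SIZE 40u   /* sizeof (IMAGE_SECTION_HEADER) */
#define SHORT_NAME_SIZE      8u   /* IMAGE_SIZEOF_SHORT_NAME */

/* headers points at the first section header and count is NumberOfSections.
   Returns a pointer to the matching 40-byte header, or NULL if none matches. */
const uint8_t *find_section(const uint8_t *headers, unsigned count,
                            const char *name)
{
    char padded[SHORT_NAME_SIZE];
    size_t n = strlen(name);
    unsigned i;

    if (n > SHORT_NAME_SIZE)
        return NULL;
    memset(padded, 0, sizeof padded);     /* NUL-pad the name to 8 bytes */
    memcpy(padded, name, n);

    for (i = 0; i < count; i++) {
        const uint8_t *sh = headers + (size_t)i * SECTION_HEADER_SIZE;
        if (memcmp(sh, padded, SHORT_NAME_SIZE) == 0)
            return sh;
    }
    return NULL;
}
```

A caller would typically pass the offset of the first section header and the NumberOfSections value from the file header, then read VirtualAddress and PointerToRawData from the returned record.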

Summary of the PE File Format

The PE file format for Windows NT introduces a completely new structure to developers familiar with the Windows and MS-DOS environments. Yet developers familiar with the UNIX environment will find that the PE file format is similar to, if not based on, the COFF specification.

The entire format consists of an MS-DOS MZ header, followed by a real-mode stub program, the PE file signature, the PE file header, the PE optional header, all of the section headers, and finally, all of the section bodies.

The optional header ends with an array of data directory entries that are relative virtual addresses to data directories contained within section bodies. Each data directory indicates how a specific section body's data is structured.

The PE file format has eleven predefined sections, as is common to applications for Windows NT, but each application can define its own unique sections for code and data.

The .debug predefined section also has the capability of being stripped from the file into a separate debug file. If so, a special debug header is used to parse the debug file, and a flag is specified in the PE file header to indicate that the debug data has been stripped.

PEFILE.C

LPVOID  WINAPI ImageDirectoryOffset (
        LPVOID    lpFile,
        DWORD     dwIMAGE_DIRECTORY)
{
    PIMAGE_OPTIONAL_HEADER   poh;
    PIMAGE_SECTION_HEADER    psh;
    int                      nSections = NumOfSections (lpFile);
    int                      i = 0;
    LPVOID                   VAImageDir;

    /* Retrieve offsets to optional and section headers. */
    poh = (PIMAGE_OPTIONAL_HEADER)OPTHDROFFSET (lpFile);
    psh = (PIMAGE_SECTION_HEADER)SECHDROFFSET (lpFile);

    /* Must be 0 thru (NumberOfRvaAndSizes-1). */
    if (dwIMAGE_DIRECTORY >= poh->NumberOfRvaAndSizes)
        return NULL;

    /* Locate image directory's relative virtual address. */
    VAImageDir = (LPVOID)poh->DataDirectory
                       [dwIMAGE_DIRECTORY].VirtualAddress;

    /* Locate section containing image directory. */
    while (i++<nSections)
        {
        if (psh->VirtualAddress <= (DWORD)VAImageDir &&
            psh->VirtualAddress +
                 psh->SizeOfRawData > (DWORD)VAImageDir)
            break;
        psh++;
        }

    if (i > nSections)
        return NULL;

    /* Return image import directory offset.
       Note: the (int) casts assume 32-bit pointers. */
    return (LPVOID)(((int)lpFile +
                     (int)VAImageDir - psh->VirtualAddress) +
                    (int)psh->PointerToRawData);
}

The function begins by retrieving pointers to the optional header and first section header, and then validates the requested data directory entry number against NumberOfRvaAndSizes. From the optional header, the function determines the data directory's virtual address, and it uses this value to determine within which section body the data directory is located. Once the appropriate section body has been identified, the specific location of the data directory is found by translating the relative virtual address of the data directory to a specific address into the file.

Official Microsoft documentation: https://msdn.microsoft.com/en-us/library/windows/desktop/ms680339(v=vs.85).aspx

WINNT.H

typedef struct _IMAGE_EXPORT_DIRECTORY {
    ULONG   Characteristics;
    ULONG   TimeDateStamp;
    USHORT  MajorVersion;
    USHORT  MinorVersion;
    ULONG   Name;
    ULONG   Base;
    ULONG   NumberOfFunctions;
    ULONG   NumberOfNames;
    PULONG  *AddressOfFunctions;
    PULONG  *AddressOfNames;
    PUSHORT *AddressOfNameOrdinals;
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

The Name field in the export directory identifies the name of the executable module. NumberOfFunctions and NumberOfNames fields indicate how many functions and function names are being exported from the module.

The AddressOfFunctions field is an offset to a list of exported function entry points. The AddressOfNames field is the address of an offset to the beginning of a null-separated list of exported function names. AddressOfNameOrdinals is an offset to a list of ordinal values (each 2 bytes long) for the same exported functions.

The three AddressOf... fields are relative virtual addresses into the address space of a process once the module has been loaded. Once the module is loaded, the relative virtual address should be added to the module base address to get the exact location in the address space of the process. Before the file is loaded, however, the address can be determined by subtracting the section header virtual address (VirtualAddress) from the given field address, adding the section body offset (PointerToRawData) to the result, and then using this value as an offset into the image file. The following example illustrates this technique:
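Stripped of the Win32 types, the translation is a subtraction and an addition per section. The sketch below is a simplified stand-in for the technique rather than the PEFILE.DLL code. SectionInfo is a hypothetical subset of IMAGE_SECTION_HEADER, and rva_to_file_offset is an illustrative name.

```c
#include <stdint.h>

/* Hypothetical subset of IMAGE_SECTION_HEADER. */
typedef struct {
    uint32_t virtual_address;      /* VirtualAddress */
    uint32_t size_of_raw_data;     /* SizeOfRawData */
    uint32_t pointer_to_raw_data;  /* PointerToRawData */
} SectionInfo;

/* Returns 1 and stores the file offset in *offset if rva falls inside the
   raw data of one of the sections. Returns 0 otherwise. */
int rva_to_file_offset(const SectionInfo *secs, unsigned count,
                       uint32_t rva, uint32_t *offset)
{
    unsigned i;

    for (i = 0; i < count; i++) {
        uint32_t start = secs[i].virtual_address;
        if (rva >= start && rva - start < secs[i].size_of_raw_data) {
            /* offset = RVA - VirtualAddress + PointerToRawData */
            *offset = rva - start + secs[i].pointer_to_raw_data;
            return 1;
        }
    }
    return 0;
}
```

Given a section with VirtualAddress 0x27000 and PointerToRawData 0x25000, an RVA of 0x271EE maps to file offset 0x251EE. Adding that offset to the base of the memory-mapped file gives a usable pointer.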

Export data section, .edata

The .edata section contains export data for an application or DLL. When present, this section contains an export directory for getting to the export information.

PE Optional Header

The next 224 bytes in the executable file make up the PE optional header. Though its name is "optional header," rest assured that this is not an optional entry in PE executable files. A pointer to the optional header is obtained with the OPTHDROFFSET macro:

PEFILE.C

#define PEFHDROFFSET(a) ((LPVOID)((BYTE *)(a) +  \
    ((PIMAGE_DOS_HEADER)(a))->e_lfanew + SIZE_OF_NT_SIGNATURE))

The only difference between this and the previous macro is that this one adds in the constant SIZE_OF_NT_SIGNATURE. Sad to say, this constant is not defined in WINNT.H, but is instead one I defined in PEFILE.H as the size of a DWORD.

Now that we know the location of the PE file header, we can examine the data in the header simply by assigning this location to a structure, as in the following example:

PIMAGE_FILE_HEADER   pfh;

pfh = (PIMAGE_FILE_HEADER)PEFHDROFFSET (lpFile);

In this example, lpFile represents a pointer to the base of the memory-mapped executable file, and therein lies the convenience of memory-mapped files. No file I/O needs to be performed; simply dereference the pointer pfh to access information in the file. The PE file header structure is defined as:

PEFILE.C

int    WINAPI RetrieveModuleName (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszModule)
{

    PIMAGE_DEBUG_DIRECTORY    pdd;
    PIMAGE_DEBUG_MISC         pdm = NULL;
    int                       nCnt;

    if (!(pdd = (PIMAGE_DEBUG_DIRECTORY)ImageDirectoryOffset
               (lpFile, IMAGE_DIRECTORY_ENTRY_DEBUG)))
        return 0;

    /* Walk the debug directories until the MISC entry is found. */
    while (pdd->SizeOfData)
        {
        if (pdd->Type == IMAGE_DEBUG_TYPE_MISC)
            {
            pdm = (PIMAGE_DEBUG_MISC)
                ((DWORD)pdd->PointerToRawData + (DWORD)lpFile);

            nCnt = lstrlen (pdm->Data)*(pdm->Unicode?2:1);
            *pszModule = (char *)HeapAlloc (hHeap,
                                            HEAP_ZERO_MEMORY,
                                            nCnt+1);
            CopyMemory (*pszModule, pdm->Data, nCnt);

            break;
            }

        pdd ++;
        }

    if (pdm != NULL)
        return nCnt;
    else
        return 0;
}

As you can see, the structure of the debug directory makes it relatively easy to locate a specific type of debug information. Once the IMAGE_DEBUG_MISC structure is located, extracting the image name is as simple as invoking the CopyMemory function.

As mentioned above, debug information can be stripped into separate .DBG files. The Windows NT SDK includes a utility called REBASE.EXE that serves this purpose. For example, in the following statement an executable image named TEST.EXE is being stripped of debug information:

rebase -b 40000 -x c:\samples\testdir test.exe

The debug information is placed in a new file called TEST.DBG and located in the path specified, in this case c:\samples\testdir. The file begins with a single IMAGE_SEPARATE_DEBUG_HEADER structure, followed by a copy of the section headers that exist in the stripped executable image. Then the .debug section data follows the section headers. So, right after the section headers are the series of IMAGE_DEBUG_DIRECTORY structures and their associated data. The debug information itself retains the same structure as described above for normal image file debug information.

The DataDirectory field is arguably one of the most important fields. It consists of 16 identical IMAGE_DATA_DIRECTORY structures.

PEFILE.C

int     WINAPI GetListOfResourceTypes (
    LPVOID    lpFile,
    HANDLE    hHeap,
    char      **pszResTypes)
{
    PIMAGE_RESOURCE_DIRECTORY          prdRoot;
    PIMAGE_RESOURCE_DIRECTORY_ENTRY    prde;
    char                               *pMem;
    int                                nCnt, i;


    /* Get root directory of resource tree. */
    if ((prdRoot = (PIMAGE_RESOURCE_DIRECTORY)ImageDirectoryOffset
           (lpFile, IMAGE_DIRECTORY_ENTRY_RESOURCE)) == NULL)
        return 0;

    /* Allocate enough space from heap to cover all types. */
    nCnt = prdRoot->NumberOfIdEntries * (MAXRESOURCENAME + 1);
    *pszResTypes = (char *)HeapAlloc (hHeap,
                                      HEAP_ZERO_MEMORY,
                                      nCnt);
    if ((pMem = *pszResTypes) == NULL)
        return 0;

    /* Set pointer to first resource type entry. */
    prde = (PIMAGE_RESOURCE_DIRECTORY_ENTRY)((DWORD)prdRoot +
               sizeof (IMAGE_RESOURCE_DIRECTORY));

    /* Loop through all resource directory entry types. */
    /* hDll is not declared in this function; it is presumably the module
       handle of PEFILE.DLL, whose string table maps type IDs to names. */
    for (i=0; i<prdRoot->NumberOfIdEntries; i++)
        {
        if (LoadString (hDll, prde->Name, pMem, MAXRESOURCENAME))
            pMem += strlen (pMem) + 1;

        prde++;
        }

    return nCnt;
}

This function returns a list of resource type names in the string identified by pszResTypes. Notice that, at the heart of this function, LoadString is called using the Name field of each resource type directory entry as the string ID. If you look in PEFILE.RC, you'll see that I defined a series of resource type strings whose IDs are defined the same as the type specifiers in the directory entries. There is also a function in PEFILE.DLL that returns the total number of resource objects in the .rsrc section. It would be rather easy to expand on these functions or write new functions that extract other information from this section.

The debug data starting address and size. For more information, see section 6.1, “The .debug Section.”

WINNT.H

typedef struct _IMAGE_DATA_DIRECTORY {
    ULONG   VirtualAddress;
    ULONG   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

Each data directory entry specifies the size and relative virtual address of the directory. To locate a particular directory, you determine the relative address from the data directory array in the optional header. Then use the virtual address to determine which section the directory is in. Once you determine which section contains the directory, the section header for that section is then used to find the exact file offset location of the data directory.

So to get a data directory, you first need to know about sections, which are described next. An example of how to locate data directories immediately follows this discussion.
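The lookup described above (find the section that contains the directory's relative virtual address, then use that section's header to get a file offset) can be sketched in portable C. This is only an illustration, not code from PEFILE.C: the SectionInfo type and the rva_to_file_offset name are hypothetical, and the struct carries just the three section header values the calculation needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a section header: only the three
   values needed to translate an RVA into a file offset. */
typedef struct {
    uint32_t virtual_address;     /* RVA where the section starts in memory */
    uint32_t virtual_size;        /* bytes the section occupies in memory   */
    uint32_t pointer_to_raw_data; /* file offset of the section body        */
} SectionInfo;

/* Returns the file offset that corresponds to `rva`, or -1 if no section
   in the table contains that address. */
int64_t rva_to_file_offset(const SectionInfo *sections, size_t count,
                           uint32_t rva)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;

        /* The RVA belongs to this section if it falls inside
           [start, start + virtual_size). */
        if (rva >= start && rva - start < sections[i].virtual_size)
            return (int64_t)sections[i].pointer_to_raw_data + (rva - start);
    }
    return -1;
}
```

With a memory-mapped file, adding the returned offset to the base pointer of the mapping would give a pointer to the directory data.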

PEFILE.C

LPVOID  WINAPI GetModuleEntryPoint (
    LPVOID    lpFile)
{
    PIMAGE_OPTIONAL_HEADER   poh;

    poh = (PIMAGE_OPTIONAL_HEADER)OPTHDROFFSET (lpFile);

    if (poh != NULL)
        return (LPVOID)poh->AddressOfEntryPoint;
    else
        return NULL;
}

  • BaseOfCode. Relative offset of code (".text" section) in loaded image.
  • BaseOfData. Relative offset of the beginning of the data section in loaded image.
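The AddressOfEntryPoint value returned by GetModuleEntryPoint is a relative virtual address, so it has to be added to the base address of the image to obtain the address where execution actually begins. A minimal sketch of that arithmetic, assuming a 32-bit image and using the hypothetical helper name entry_point_va:

```c
#include <stdint.h>

/* Virtual address of the first instruction, assuming the image is mapped
   at `image_base` (the preferred ImageBase, or the actual load address if
   the loader had to place the image elsewhere). */
uint32_t entry_point_va(uint32_t image_base, uint32_t entry_point_rva)
{
    return image_base + entry_point_rva;
}
```

For an image loaded at 0x00400000 whose AddressOfEntryPoint is 0x1000, this yields 0x00401000.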

ImageBase  ***(must know)***

The Portable Executable File Format from Top to Bottom

Randy Kath
Microsoft Developer Network Technology Group

Created: June 12, 1993

 


Windows NT Additional Fields

The additional fields added to the Windows NT PE file format provide loader support for much of the Windows NT–specific process behavior. Following is a summary of these fields.

  • ImageBase. Preferred base address in the address space of a process to map the executable image to. The linker that comes with the Microsoft Win32 SDK for Windows NT defaults to 0x00400000, but you can override the default with the -BASE: linker switch.
  • SectionAlignment. Each section is loaded into the address space of a process sequentially, beginning at ImageBase. SectionAlignment dictates the minimum amount of space a section can occupy when loaded; that is, sections are aligned on SectionAlignment boundaries.

    Section alignment can be no less than the page size (currently 4096 bytes on the x86 platform) and must be a multiple of the page size as dictated by the behavior of Windows NT's virtual memory manager. 4096 bytes is the x86 linker default, but this can be set using the -ALIGN: linker switch.

  • FileAlignment. Minimum granularity of chunks of information within the image file prior to loading. For example, the linker zero-pads a section body (raw data for a section) up to the nearest FileAlignment boundary in the file. Version 2.39 of the linker mentioned earlier aligns image files on a 0x200-byte granularity. This value is constrained to be a power of 2 between 512 and 65,536 (64 KB), inclusive.

  • MajorOperatingSystemVersion. Indicates the major version of the Windows NT operating system, currently set to 1 for Windows NT version 1.0.
  • MinorOperatingSystemVersion. Indicates the minor version of the Windows NT operating system, currently set to 0 for Windows NT version 1.0.
  • MajorImageVersion. Used to indicate the major version number of the application; in Microsoft Excel version 4.0, it would be 4.
  • MinorImageVersion. Used to indicate the minor version number of the application; in Microsoft Excel version 4.0, it would be 0.
  • MajorSubsystemVersion. Indicates the Windows NT Win32 subsystem major version number, currently set to 3 for Windows NT version 3.10.
  • MinorSubsystemVersion. Indicates the Windows NT Win32 subsystem minor version number, currently set to 10 for Windows NT version 3.10.
  • Reserved1. Unknown purpose, currently not used by the system and set to zero by the linker.
  • SizeOfImage. Indicates the amount of address space to reserve in the address space for the loaded executable image. This number is influenced greatly by SectionAlignment. For example, consider a system having a fixed page size of 4096 bytes. If you have an executable with 11 sections, each less than 4096 bytes, aligned on a 65,536-byte boundary, the SizeOfImage field would be set to 11 * 65,536 = 720,896 (176 pages). The same file linked with 4096-byte alignment would result in 11 * 4096 = 45,056 (11 pages) for the SizeOfImage field. This is a simple example in which each section requires less than a page of memory. In reality, the linker determines the exact SizeOfImage by figuring each section individually. It first determines how many bytes the section requires, then it rounds up to the nearest page boundary, and finally it rounds the page count up to the nearest SectionAlignment boundary. The total is then the sum of each section's individual requirement.
  • SizeOfHeaders. This field indicates how much space in the file is used for representing all the file headers, including the MS-DOS header, PE file header, PE optional header, and PE section headers. The section bodies begin at this location in the file.
  • CheckSum. A checksum value is used to validate the executable file at load time. The value is set and verified by the linker. The algorithm used for creating these checksum values is proprietary information and will not be published.
  • Subsystem. Field used to identify the target subsystem for this executable. Each of the possible subsystem values is listed in the WINNT.H file immediately after the IMAGE_OPTIONAL_HEADER structure.
  • DllCharacteristics. Flags used to indicate if a DLL image includes entry points for process and thread initialization and termination.
  • SizeOfStackReserve, SizeOfStackCommit, SizeOfHeapReserve, SizeOfHeapCommit. These fields control the amount of address space to reserve and commit for the stack and default heap. Both the stack and heap have default values of 1 page committed and 16 pages reserved. These values are set with the linker switches -STACKSIZE: and -HEAPSIZE:.
  • LoaderFlags. Tells the loader whether to break on load, debug on load, or the default, which is to let things run normally.
  • NumberOfRvaAndSizes. This field identifies the length of the DataDirectory array that follows. It is important to note that this field is used to identify the size of the array, not the number of valid entries in the array.
  • DataDirectory. The data directory indicates where to find other important components of executable information in the file. It is really nothing more than an array of IMAGE_DATA_DIRECTORY structures that are located at the end of the optional header structure. The current PE file format defines 16 possible data directories, 11 of which are now being used.
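The rounding described for SizeOfImage can be written out as a short worked example. The sketch below follows only the simplified description in the list above (each section is rounded up to a page, then to SectionAlignment, and the results are summed); it ignores the space taken by the headers, and the function names align_up and estimate_size_of_image are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`.
   `alignment` is assumed to be a power of 2. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Sum of every section's size after rounding it up to a page boundary and
   then to a SectionAlignment boundary. Header space is not included. */
uint32_t estimate_size_of_image(const uint32_t *section_sizes, size_t count,
                                uint32_t page_size, uint32_t section_alignment)
{
    uint32_t total = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t paged = align_up(section_sizes[i], page_size);
        total += align_up(paged, section_alignment);
    }
    return total;
}
```

For 11 sections of 1,000 bytes each and a 4096-byte page, this gives 720,896 with 65,536-byte alignment and 45,056 with 4096-byte alignment, matching the figures in the SizeOfImage description. The same align_up helper also expresses the FileAlignment padding: a 0x1234-byte section body padded to a 0x200 boundary occupies 0x1400 bytes in the file.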

Reserved, must be 0

Structure of PE Files

The PE file format is organized as a linear stream of data. It begins with an MS-DOS header, a real-mode program stub, and a PE file signature. Immediately following is a PE file header and optional header. Beyond that, all the section headers appear, followed by all of the section bodies. Closing out the file are a few other regions of miscellaneous information, including relocation information, symbol table information, line number information, and string table data. All of this is more easily absorbed by looking at it graphically, as shown in Figure 1.

Figure 1. Structure of a Portable Executable file image

Starting with the MS-DOS file header structure, each of the components in the PE file format is discussed below in the order in which it occurs in the file. Much of this discussion is based on sample code that demonstrates how to get to the information in the file. All of the sample code is taken from the file PEFILE.C, the source module for PEFILE.DLL. Each of these examples takes advantage of one of the coolest features of Windows NT, memory-mapped files. Memory-mapped files permit the use of simple pointer dereferencing to access the data contained within the file. Each of the examples uses memory-mapped files for accessing data in PE files.

**Note**   Refer to the section at the end of this article for a discussion on how to use PEFILE.DLL.

The attribute certificate table address and size. For more information, see section 5.7, “The Attribute Certificate Table (Image Only).”

MS-DOS/Real-Mode Header

As mentioned above, the first component in the PE file format is the MS-DOS header. The MS-DOS header is not new for the PE file format. It is the same MS-DOS header that has been around since version 2 of the MS-DOS operating system. The main reason for keeping the same structure intact at the beginning of the PE file format is so that, when you attempt to load a file created under Windows version 3.1 or earlier, or MS-DOS version 2.0 or later, the operating system can read the file and understand that it is not compatible. In other words, when you attempt to run a Windows NT executable on MS-DOS version 6.0, you get this message: "This program cannot be run in DOS mode." If the MS-DOS header were not included as the first part of the PE file format, the operating system would simply fail the attempt to load the file and offer something completely useless, such as: "The name specified is not recognized as an internal or external command, operable program or batch file."

The MS-DOS header occupies the first 64 bytes of the PE file. A structure representing its content is described below:

WINNT.H

typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    USHORT e_magic;         // Magic number
    USHORT e_cblp;          // Bytes on last page of file
    USHORT e_cp;            // Pages in file
    USHORT e_crlc;          // Relocations
    USHORT e_cparhdr;       // Size of header in paragraphs
    USHORT e_minalloc;      // Minimum extra paragraphs needed
    USHORT e_maxalloc;      // Maximum extra paragraphs needed
    USHORT e_ss;            // Initial (relative) SS value
    USHORT e_sp;            // Initial SP value
    USHORT e_csum;          // Checksum
    USHORT e_ip;            // Initial IP value
    USHORT e_cs;            // Initial (relative) CS value
    USHORT e_lfarlc;        // File address of relocation table
    USHORT e_ovno;          // Overlay number
    USHORT e_res[4];        // Reserved words
    USHORT e_oemid;         // OEM identifier (for e_oeminfo)
    USHORT e_oeminfo;       // OEM information; e_oemid specific
    USHORT e_res2[10];      // Reserved words
    LONG   e_lfanew;        // File address of new exe header
  } IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The first field, e_magic, is the so-called magic number. This field is used to identify an MS-DOS–compatible file type. All MS-DOS–compatible executable files set this value to 0x5A4D, which represents the ASCII characters MZ (stored in the file as the bytes 0x4D, 0x5A). MS-DOS headers are sometimes referred to as MZ headers for this reason. Many other fields are important to MS-DOS operating systems, but for Windows NT, there is really one more important field in this structure. The final field, e_lfanew, is a 4-byte offset into the file where the PE file header is located. It is necessary to use this offset to locate the PE header in the file. For PE files in Windows NT, the PE file header occurs soon after the MS-DOS header with only the real-mode stub program between them.
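The two fields that matter here can be checked without any Windows headers. The sketch below is an illustration rather than code from PEFILE.C: the function name find_pe_header is hypothetical, the file is assumed to be already loaded (or mapped) into a byte buffer, and the signature test relies on the PE signature being the four bytes 'P', 'E', 0, 0. The offset 60 follows from the structure above, where e_lfanew is the last 4 bytes of the 64-byte header.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the file offset of the PE signature, or -1 if the buffer does
   not look like a PE file. Fields are read byte by byte so the result
   does not depend on host endianness or struct packing. */
int64_t find_pe_header(const uint8_t *file, size_t size)
{
    /* The MS-DOS header is 64 bytes long and starts with "MZ". */
    if (size < 64 || file[0] != 'M' || file[1] != 'Z')
        return -1;

    /* e_lfanew is stored little-endian at offset 60. */
    uint32_t e_lfanew = (uint32_t)file[60]
                      | ((uint32_t)file[61] << 8)
                      | ((uint32_t)file[62] << 16)
                      | ((uint32_t)file[63] << 24);

    /* The 4-byte signature must lie entirely inside the buffer. */
    if (e_lfanew > size || size - e_lfanew < 4)
        return -1;

    if (file[e_lfanew] != 'P' || file[e_lfanew + 1] != 'E' ||
        file[e_lfanew + 2] != 0 || file[e_lfanew + 3] != 0)
        return -1;

    return (int64_t)e_lfanew;
}
```

Reading the bytes individually avoids casting the buffer to IMAGE_DOS_HEADER, which keeps the sketch independent of compiler padding; with WINNT.H available, the structure cast used in PEFILE.C is the more direct route.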

SectionAlignment ***(must know)***

Section Headers

Section headers are located sequentially right after the optional header in the PE file format. Each section header is 40 bytes with no padding between them. Section headers are defined as in the following structure:
