SPO600 – Lab 4 – Codebase Analysis

Introduction
In this lab we download an RPM package and search its source for assembly language. We will look for .s and .S files and search inside .c files for in-line assembly. It is also worth searching .cpp, .asm, and .h files, since these may also contain platform-specific code. Once all the assembly is believed to be found, we will analyse it to determine its purpose: performance, memory barriers, atomics, or low-level interactions.

Assembly is not always the only source of platform-specific code. If an assembly search comes up empty, the code may still use intrinsics, which can tie the program to a specific compiler and even a specific platform. In another program analysis I ran into SIMD vector code that was written in C++ but only compiled on specific architectures.

The package I will be searching is “syslinux”, a collection of lightweight bootloaders for a number of different file systems. It also includes a tool for booting legacy operating systems from different media. After looking through the spec file it is clear that this package is only for x86 processors, but it looks like it will run on both 32-bit and 64-bit:

ExclusiveArch: %{ix86} x86_64

Getting the Package
To start my search I downloaded the package and started searching through the source (this was performed on Fedora 19).

Download Package and Source:

fedpkg clone -a syslinux
cd syslinux
fedpkg prep
cd packagename-version

The Search
Objective: Find all files that have assembly within

Gather a list of files to search:

find ./ | egrep -i "\.s$|\.asm$|\.c$|\.cpp$|\.h$|\.cc$" >> ~/spo600-package1/files.txt

Now we have a file “files.txt” which contains the names of all files that might have something to do with assembly. The files ending in “.s”, “.S”, and “.asm” are probably assembly language source files. However, assembly can also be embedded in C and C++ code as in-line assembly, so we will search through all “.c”, “.cc”, “.cpp”, and “.h” files for it.
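The same file-gathering step can be sketched in Python. This is only an illustration of the idea, not part of the original lab: the function name `find_candidate_files` and the constant `CANDIDATE_EXTENSIONS` are made up for this example. Unlike the find/egrep pipeline, it only reports regular files, never directories.

```python
import os

# Extensions that may hold assembly or platform-specific code.
# Matching is case-insensitive, mirroring the -i flag given to egrep.
CANDIDATE_EXTENSIONS = (".s", ".asm", ".c", ".cpp", ".h", ".cc")


def find_candidate_files(root):
    """Return a sorted list of file paths under root with a candidate extension."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple and checks each suffix in turn.
            if name.lower().endswith(CANDIDATE_EXTENSIONS):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Print one path per line, like the contents of files.txt.
    for path in find_candidate_files("."):
        print(path)
```

Lower-casing the file name before comparing is what lets a single “.s” entry cover both “.s” and “.S” files.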

Search for in-line assembly:

grep "asm(" $(cat ~/spo600-package1/files.txt)
grep "__asm" $(cat ~/spo600-package1/files.txt)

We now have a list of all files that contain in-line assembly, as well as the list of assembly source files found earlier. It is time to look and see what they are doing.
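As a rough illustration of what the in-line assembly search is looking for, here is a small Python sketch. The pattern `INLINE_ASM` and the function `find_inline_asm` are assumptions made for this example, not something taken from syslinux or the lab. The pattern accepts `asm`, `__asm`, or `__asm__`, an optional `volatile` qualifier, and then an opening parenthesis, so it also catches `asm volatile (` which a plain search for "asm(" would miss. It does not catch brace-style `__asm { ... }` blocks.

```python
import re

# Assumed pattern for GCC-style in-line assembly statements.
INLINE_ASM = re.compile(
    r"\b(?:__asm__|__asm|asm)\b\s*(?:(?:__volatile__|volatile)\s*)?\("
)


def find_inline_asm(text):
    """Return (line_number, stripped_line) pairs for lines that look like in-line assembly."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if INLINE_ASM.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = 'int x;\n__asm__ volatile ("nop");\nasm("cli");\nmy_asm(x);\n'
    for number, line in find_inline_asm(sample):
        print(number, line)
```

The word boundary in front of the keyword keeps ordinary identifiers that merely end in "asm", such as `my_asm(`, out of the results.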

Analysis
First we will start with the assembly source files, the ones with the “.S” extension. These files can be found here. Files that seem to come up often are:

memset.S
memcpy.S
memmove.S

There are multiple versions of these files in different directory structures. The directories that these files are in seem to represent different file systems and programs.

The ./dos/memcpy.S program:

#
# memcpy.S
#
# Simple 16-bit memcpy() implementation
#

 .text
 .code16gcc
 .globl memcpy
 .type memcpy, @function
memcpy:
 cld
 pushw %di
 pushw %si
 movw %ax,%di
 movw %dx,%si
 # The third argument is already in cx
 rep ; movsb
 popw %si
 popw %di
 ret

 .size memcpy,.-memcpy

The memcpy routine is written to be called as a function from other code. It copies a specified number of bytes from a source to a destination. First it clears the direction flag with “cld”, which controls the direction in which the string instruction used below walks through memory. Next it saves %di and %si on the stack so it can put them back when it is done using them. It then moves the first argument (the destination, passed in %ax) into %di and the second argument (the source, passed in %dx) into %si, and runs “rep ; movsb”. The “rep” prefix repeats the following instruction %cx times, decrementing %cx on each repetition, and %cx already holds the third argument, the size. The “movsb” instruction copies a single byte from the address in %si to the address in %di, then increments (or decrements) both registers depending on the direction flag; because “cld” cleared the flag, they increment. After copying the memory byte by byte to the destination, the routine pops the saved %si and %di values off the stack and returns. Since %ax is never modified, it still holds the destination address on return.
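To make the register traffic easier to follow, here is a small Python model of the routine. It is a sketch under simplifying assumptions, not a faithful emulator: memory is a single flat `bytearray`, segment registers are ignored, and the names `rep_movsb` and `memcpy16` are invented for this example.

```python
def rep_movsb(memory, di, si, cx):
    """Model `cld; rep movsb`: copy cx bytes from memory[si] onward to memory[di] onward."""
    while cx != 0:
        memory[di] = memory[si]   # movsb: copy one byte from [si] to [di]
        si = (si + 1) & 0xFFFF    # direction flag clear: both pointers increment
        di = (di + 1) & 0xFFFF    # (masked to 16 bits, like the real registers)
        cx = (cx - 1) & 0xFFFF    # rep: count down until cx reaches zero
    return di, si, cx


def memcpy16(memory, ax, dx, cx):
    """Model of memcpy.S: ax = destination, dx = source, cx = byte count."""
    # The pushw/popw pairs in the assembly only preserve the caller's %di and
    # %si, so this model has nothing to save or restore.
    di = ax                       # movw %ax,%di
    si = dx                       # movw %dx,%si
    rep_movsb(memory, di, si, cx)
    return ax                     # %ax is untouched, so it still holds the destination


if __name__ == "__main__":
    memory = bytearray(b"hello world!....")
    memcpy16(memory, 12, 0, 4)    # copy 4 bytes from offset 0 to offset 12
    print(memory)
```

Because the copy always runs from low addresses to high ones, an overlapping copy where the destination starts inside the source region would overwrite bytes before they are read, which is the usual difference between memcpy and memmove.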

About oatleywillisa

Computer Networking Student
This entry was posted in SBR600.
