Add the submitted Interim Checkpoint source file in doc folder

sBubshait 2024-06-16 19:23:32 +01:00
parent dd7ec0c234
commit 637a5d67a3


@@ -1,30 +1,84 @@
\documentclass[11pt]{article}
\documentclass[10pt]{article}
\usepackage{fullpage}
\usepackage[margin=2cm]{geometry}
\setlength{\parindent}{0pt}
\begin{document}
\title{ARM Checkpoint... }
\author{TODO}
\title{\scshape{\vspace{-2cm}ARMv8 Checkpoint}}
\author{\textbf{Group \#43}\\ Saleh Bubshait, Themis Demetriades, Ethan Dias Alberto, George Niedringhaus}
\date{7 June 2024}
\maketitle
\section{Group Organisation}
Our group has decided to divide the project into two main components: the emulator
and the assembler. To manage this, we have paired up, with Saleh and Themis working
on the emulator, and Ethan and George working on the assembler. Prior to commencing
the project, we convened to outline the project's overall structure, making key decisions
including the memory and machine layout, the two-pass assembler, and the use of an Internal
Representation (IR) in both parts. To ensure timely completion, we established a detailed timeline for each part, aligned with the final deadline. We also committed to meeting twice a week to discuss our progress and ensure adherence to our schedule. For collaboration, we used Google Docs to share ideas and updates. Although we initially planned to use pair programming across the teams, we soon discovered it was less efficient. It proved more effective to review and scrutinize entire completed modules rather than working simultaneously. Consequently, we transitioned to individual work, utilising GitLab branches for code sharing to enhance overall efficiency.\\
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
Overall, our team organization proved effective for the task. The independence of the emulator and assembler components allowed each sub-team to work simultaneously, reducing scheduling conflicts and minimising codebase dependencies that could delay progress. Pairing members on interdependent components ensured high coding standards through familiarity with the codebase and conventions, facilitating seamless integration. It also enabled more concentrated efforts on each part, facilitating faster completion of individual components, as seen with the emulator being nearly complete at the time of writing this report.\\
\section{Implementation Strategies}
Despite these advantages, the separation resulted in discrepancies in coding styles between the emulator and assembler, which caused readability issues and slowed down integration. Code repetition became inevitable due to communication difficulties arising from working on weakly connected halves. This disconnectedness also created uncertainty within the team, with some members unsure of their roles and responsibilities, leading to inefficiencies in progress. To address these issues, we plan to adopt a more integrated approach for the remaining parts of the project. We will work as a single group to ensure consistent coding standards and improve overall communication. Increasing the frequency of our meetings will help us stay on track and allow for more immediate feedback and adjustments. Establishing a unified coding style from the beginning will also mitigate the readability problems we encountered. By making these changes, we aim to enhance collaboration and efficiency, ensuring smoother progress and better quality in our future work.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
\end{document}
\section{Emulator Implementation}
We have structured the emulator such that each stage of emulation has a separate module that can be written and tested separately, abstracting away specifics to allow for reusability. These modules consist of:
\begin{itemize}
\itemsep0em
\item \texttt{emulate}: This serves as the program entry point, housing only the main function. It loads binary files into the memory using \texttt{fileio}, initializes the machine state, emulates each fetched instruction using \texttt{decode} and \texttt{execute}, prints the machine state using \texttt{print}, and finally frees memory.
\item \texttt{fileio}: A binary file loader that can load arbitrary binary files to an arbitrarily sized heap chunk with appropriate error checking.
\item \texttt{a64instruction}: A module that defines the Internal Representation (IR) of instructions in the Arm64 architecture.
\item \texttt{decode}: A module that translates a given binary instruction into the internal representation.
\item \texttt{execute}: A module that updates a given processor state based on the IR of an instruction, simulating the execution of that instruction.
\item \texttt{print}: A module that prints the current state of the machine to a given character stream in a format that is easy to read and test.
\item \texttt{machineutil}: A module that contains utility functions related to the machine state, such as functions to initialise the machine state, read and write to memory, read and write to registers, and update the condition codes.
\item \texttt{binaryutil}: A module that contains utility functions related to binary manipulation, such as functions to extract specific bits from a binary value, sign-extend a binary value, and rotate a binary value.
\end{itemize}
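To illustrate the kind of helpers that \texttt{binaryutil} is described as providing, the following is a minimal sketch of bit extraction, sign extension, and rotation. The function names and signatures (\texttt{extract\_bits}, \texttt{sign\_extend}, \texttt{rotate\_right64}) are assumptions made for illustration and are not taken from the actual module.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names: the report does not give the real signatures. */

/* Return the `width` bits of `value` starting at bit `lsb`, right-aligned. */
static uint32_t extract_bits(uint32_t value, unsigned lsb, unsigned width) {
    assert(width >= 1 && lsb + width <= 32);
    uint32_t mask = (width >= 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (value >> lsb) & mask;
}

/* Sign-extend the low `width` bits of `value` to a 64-bit signed integer. */
static int64_t sign_extend(uint64_t value, unsigned width) {
    assert(width >= 1 && width <= 64);
    if (width == 64) {
        return (int64_t)value;
    }
    uint64_t sign = UINT64_C(1) << (width - 1);
    value &= (sign << 1) - 1;                /* discard bits above the field */
    return (int64_t)((value ^ sign) - sign); /* propagate the sign bit upward */
}

/* Rotate a 64-bit value right by `amount` bits (taken modulo 64). */
static uint64_t rotate_right64(uint64_t value, unsigned amount) {
    amount &= 63;
    if (amount == 0) {
        return value; /* avoid an undefined shift by 64 */
    }
    return (value >> amount) | (value << (64 - amount));
}
```

For example, a 19-bit branch offset field whose bits are all set would sign-extend to $-1$, and extracting bits 5 to 20 of the word \texttt{0xD2800020} yields the 16-bit immediate 1.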
One module worth highlighting is the \texttt{a64instruction} module, which defines the IR of instructions in the Arm64 architecture. Instructions are represented using a struct data type containing an enumeration specifying the instruction type and a union field holding data specific to each instruction type, as depicted in Figure 2. This design facilitates easy identification of instruction types and error detection. Additionally, by employing a hierarchical structure for the struct types holding instruction data, information shared among instruction types is stored once. This minimizes code duplication in both the decoding and execution modules, since each can handle the shared fields first and then the fields unique to each instruction type.\\
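As a rough illustration of how a decoder might choose which member of such a union to fill in, the sketch below classifies a 32-bit instruction word using bits 28 to 25 of the A64 top-level encoding. The enumeration and function names are hypothetical, and this is not necessarily how the \texttt{decode} module performs the classification.

```c
#include <stdint.h>

/* Hypothetical instruction groups, mirroring the four union members. */
typedef enum {
    GROUP_DP_IMMEDIATE,
    GROUP_DP_REGISTER,
    GROUP_SINGLE_TRANSFER,
    GROUP_BRANCH,
    GROUP_UNKNOWN
} inst_group;

/* Classify an instruction word by the op0 field (bits 28..25). */
static inst_group classify_instruction(uint32_t word) {
    uint32_t op0 = (word >> 25) & 0xF;

    if ((op0 & 0xE) == 0x8) return GROUP_DP_IMMEDIATE;    /* 100x */
    if ((op0 & 0xE) == 0xA) return GROUP_BRANCH;          /* 101x */
    if ((op0 & 0x7) == 0x5) return GROUP_DP_REGISTER;     /* x101 */
    if ((op0 & 0x5) == 0x4) return GROUP_SINGLE_TRANSFER; /* x1x0 */
    return GROUP_UNKNOWN; /* anything else is outside the four groups */
}
```

Once the group is known, the decoder can set the type tag and populate only the matching union member, so an unrecognised word can be reported as an error instead of being executed.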
Each utility function was tested thoroughly using unit tests, and the overall emulator was tested using the provided test suite. We have developed a script \texttt{test.sh} that runs all these tests along with memory checking using Valgrind.\\
Many of the services provided by these modules can be reused in the assembler. For one, the assembler will benefit from being able to use the internal representation as an intermediary step between assembly and binary. This also means that it will be able to reuse \texttt{binaryutil} helper functions and \texttt{decode} constants to map fields in the internal representation to bit positions in the final binary (as the decode module performs the exact inverse operation). Furthermore, general binary manipulation such as performing sign-extension on values or retrieving specific bit ranges will almost certainly be necessary, and these functions have already been implemented and tested. Finally, implementation of the assembler will benefit from existing data types, aliases, and helper functions that store data and perform arithmetic and logical operations in a platform-independent manner (such as by accounting for endianness).
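The idea that encoding is the inverse of decoding can be sketched as follows: the same field description (bit position and width) serves both to read a field from an instruction word and to write it back, and the finished word is stored byte by byte so that the output does not depend on the host's endianness. The type \texttt{bit\_field}, the two field constants, and all function names here are assumptions for illustration, not the project's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a field within a 32-bit instruction word. */
typedef struct {
    unsigned lsb;   /* index of the least significant bit */
    unsigned width; /* number of bits in the field */
} bit_field;

/* Example fields of a wide-move instruction, shared by decoder and encoder. */
static const bit_field FIELD_RD = {0, 5};
static const bit_field FIELD_IMM16 = {5, 16};

static uint32_t field_mask(bit_field f) {
    assert(f.width >= 1 && f.lsb + f.width <= 32);
    return (f.width >= 32) ? UINT32_MAX : ((UINT32_C(1) << f.width) - 1);
}

/* Decoder direction: read a field out of an instruction word. */
static uint32_t read_field(uint32_t word, bit_field f) {
    return (word >> f.lsb) & field_mask(f);
}

/* Encoder direction: write a field into an instruction word. */
static uint32_t write_field(uint32_t word, bit_field f, uint32_t value) {
    uint32_t mask = field_mask(f);
    return (word & ~(mask << f.lsb)) | ((value & mask) << f.lsb);
}

/* Store a word in little-endian byte order, whatever the host order is. */
static void store_word_le(uint8_t *dest, uint32_t word) {
    for (unsigned i = 0; i < 4; i++) {
        dest[i] = (uint8_t)(word >> (8 * i));
    }
}
```

With this arrangement, writing a value with \texttt{write\_field} and reading it back with \texttt{read\_field} returns the original value, which gives a simple round-trip property to unit test.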
\begin{figure}[!htb]
\begin{minipage}{0.45\textwidth}
\begin{verbatim}
typedef struct {
dword registers[REGISTER_COUNT];
dword pc;
byte *memory;
PState condition_codes;
} Machine;
\end{verbatim}
\caption{Processor state representation}
\end{minipage}
\hfill
\begin{minipage}{0.6\textwidth}
\begin{verbatim}
typedef struct {
a64inst_type type;
union {
a64inst_DPImmediateData DPImmediateData;
a64inst_DPRegisterData DPRegisterData;
a64inst_BranchData BranchData;
a64inst_SingleTransferData SingleTransferData;
} data;
} a64inst_instruction;
\end{verbatim}
\caption{Internal representation of processor instructions}
\end{minipage}
\end{figure}
\vspace*{-0.5cm}
\section{Implementation Challenges}
Looking ahead, we anticipate several challenges as we progress to the later stages of the project. Mapping assembly code to binary values may prove challenging, requiring careful consideration and potential remapping efforts. However, by reusing the Internal Representation of instructions, we can streamline this process, avoiding cryptic numerical values and excessive remapping. Another significant challenge we foresee is that debugging may become more complex and time-consuming, compounded by the risk of memory leaks and segmentation faults as we manipulate data structures and manage memory dynamically. To mitigate this, we will implement thorough testing procedures, including unit tests and integration tests, and conduct regular code reviews. We will also adopt best practices for memory management, such as regularly checking for memory leaks using tools like Valgrind and ensuring that all allocated memory is properly freed. By proactively addressing these anticipated challenges, we aim to ensure the smooth progression and successful completion of our project.
\end{document}