Assignment Chef


Assignment catalog

33,401 assignments available

[SOLVED] CS3630 Project 4: Particle Filter Implementation in Webots Simulator

In Project 4, you will implement a particle filter in the Webots simulator. The project builds upon the foundation established in prior labs, emphasizing:

1. Coordinate Transformation: Understand and implement coordinate transformations between the world frame, sensor frame, and robot frame to enable accurate localization and perception in robotics tasks.
2. Differential Drive: Understand and implement the kinematics of a differential drive robot.
3. Detection Failure and Spurious Detection Handling: Develop strategies to address detection failures and spurious detections, ensuring reliable performance of robotic perception systems in challenging scenarios.
4. Operating in Ambiguous Environments: Explore techniques for navigating larger environments with strong ambiguity, including solving the kidnapped robot problem, to enhance adaptability and robustness in real-world robotics applications.

A. Project Structure

The project directory, Project_4, contains the following files:

Worlds:
• simple_world.wbt: A simple square world for testing the particle filter. (Ref. Figure 1b)

Controllers:
• proj4_simple_world1_controller.py: Controls the robot to turn in place in the center of the simple world.
• proj4_maze_world1_controller.py: Controls the robot's movement along a predefined trajectory in the maze world.

Particle_filter:
• contour.py: Extracts contours from camera images.
• geometry.py: Contains functions for geometric calculations and transformations.
• gui.py: Creates a Graphical User Interface (GUI) for visualizing the particle filter simulation.
• particle_filter.py: Contains the particle filter algorithm.
• run_pf.py: Reads captured data in the "data" folder and runs the particle filter without the Webots simulator.
• sensors.py: Converts sensor data for the particle filter algorithm.
• setting.py: Holds various configuration settings used throughout the project.
• utils.py: Contains utility functions commonly used in the project.
• unit_tests.py: Contains unit tests for functions in geometry.py and environment.py. It does not test the particle filter implementation.

B. General Guidance

• Implement the functions in order in geometry.py, environment.py, and particle_filter.py.
• Refer to the comments in each TODO for specific implementation guidance.
• Use the reference lectures and additional resources to understand the particle filter algorithm.
• Validate your code with the provided unit tests by running unit_tests.py.
• Submit the modified files in the controller's folder as instructed.

C. Instructions

Please follow these instructions carefully to ensure successful project completion.

First, complete the code modifications in geometry.py and environment.py:
o geometry.py: transform_point(), compose(), inverse()
o environment.py: read_marker_measures() and diff_drive_kinematics()

Run unit_tests.py to verify your implementations in geometry.py and environment.py. Ensure all test cases pass before proceeding to the next steps; this step is crucial for validating the correctness of your code modifications.

When the provided test cases in unit_tests.py pass, implement the functions in particle_filter.py:
o particle_likelihood()
o compute_particle_weights()

Your goal is to make the estimated robot pose as close as possible to the ground-truth robot pose. You can open the world files in the "worlds" folder and run the robot controller in Webots. The robot will move along a predefined trajectory, and a Python GUI will display the particles. The larger triangles represent the robot and its estimate, and the smaller triangles indicate the particle poses. The orange and purple lines in front of the robot illustrate marker measurements by the cameras and lidar, respectively, and the grey dashed lines are the robot's field of view.
The small cyan dots represent the lidar array.

You can verify the particle filter convergence in one of the following ways:

o Test in Webots:
▪ Set DATA_CAPTURE_MODE = False in setting.py.
▪ Open a Webots world file and run the attached controller.
▪ A Python GUI will show up to visualize the particles.

o Capture data in Webots, then run the particle filter separately:
▪ Set DATA_CAPTURE_MODE = True in setting.py.
▪ Open a Webots world file and run the attached controller. The captured data will be stored in the "data" folder.
▪ Change the SCENARIO_NAME variable to the desired world name in run_pf.py.
▪ Run run_pf.py in the particle filter folder.
▪ A Python GUI will show up to visualize the particles.

It usually takes several minutes to finish running the particle filter on the maps.

D. Hints for Code Completion

• geometry.py:
o The SE2 class represents a 2D pose/transformation, incorporating both position and orientation components.
o Use this class to represent the pose of a coordinate frame, perform coordinate frame transformations, and apply transformation operations to rotate and translate coordinate frames or points.
o Implement methods for point transformation (transform_point), composition of transformations (compose), and inversion of transformations (inverse).

• environment.py:
o The read_marker_measures function generates expected ground-truth marker measurements given the pose of the robot.
o Use coordinate transformations to compute the marker positions relative to the robot's pose.
o Remember to handle visibility checks and consider the hints provided in the function comments.
o For the diff_drive_kinematics function, use the provided robot and wheel radius to convert wheel rotational speeds into linear and angular velocities. Then apply the kinematic equations specific to differential drive robots to calculate the forward speed and counterclockwise rotational speed from the rotational speeds of the left and right wheels.
• particle_filter.py:
o The particle_likelihood function calculates the likelihood of a particle pose being the robot's pose based on observed marker measurements.
o Treat unmatched particle marker measures as detection failures and unmatched robot marker measures as spurious detections.
o Use helper functions like generate_marker_pairs and marker_likelihood implemented in Project 3.

E. Submission

F. Grading (Total 100 points)

• geometry.py (30 points)
o transform_point() – 10 points
o compose() – 10 points
o inverse() – 10 points
• environment.py (30 points)
o read_marker_measures() – 20 points
o diff_drive_odometry() – 10 points
• Integration test (40 points)
o simple map – 15 points
o maze map – 25 points

G. Additional Notes

• Run the provided local unit tests on the simple world before proceeding to the maze world.
• Ensure your implementations adhere to the specified requirements and maximize your score potential by following the instructions carefully.
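The handout names the functions but not their bodies. As a rough, pure-Python illustration of the math behind transform_point/compose/inverse and diff_drive_kinematics (the course's actual SE2 class and function signatures may differ; the wheel_radius/axle_length parameters here are assumptions):

```python
import math

# Minimal SE2 sketch: a 2D pose (x, y, theta). Method names follow the
# handout; the real course class may use a different representation.
class SE2:
    def __init__(self, x, y, theta):
        self.x, self.y, self.theta = x, y, theta

    def transform_point(self, px, py):
        # world = R(theta) @ p + t: rotate by theta, then translate.
        c, s = math.cos(self.theta), math.sin(self.theta)
        return (self.x + c * px - s * py, self.y + s * px + c * py)

    def compose(self, other):
        # self ∘ other: express `other` (given in self's frame) in the world frame.
        x, y = self.transform_point(other.x, other.y)
        return SE2(x, y, self.theta + other.theta)

    def inverse(self):
        # Inverse transform: rotation -theta, translation -R(-theta) @ t.
        c, s = math.cos(self.theta), math.sin(self.theta)
        return SE2(-(c * self.x + s * self.y),
                   -(-s * self.x + c * self.y),
                   -self.theta)

def diff_drive_kinematics(wl, wr, wheel_radius, axle_length):
    """Wheel angular speeds (rad/s) -> (forward speed v, ccw turn rate w)."""
    v = wheel_radius * (wr + wl) / 2.0
    w = wheel_radius * (wr - wl) / axle_length
    return v, w
```

Composing a pose with its inverse should give the identity pose, and equal wheel speeds should give pure forward motion — both are cheap sanity checks you can mirror in unit_tests.py.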

$25.00

[SOLVED] CS3316 Assignment 4

1 Introduction

The goal of this assignment is to experiment with the deep Q-network (DQN), which combines the advantages of Q-learning and neural networks. In classical Q-learning methods, the action-value function Q becomes intractable as the state space and action space grow. DQN builds on the success of deep learning and has achieved a super-human level of play in Atari games. Your goal is to implement the DQN algorithm and an improved variant, and test them in some classical RL control scenarios.

2 Deep Q-learning

Algorithm 1: deep Q-learning with experience replay
  Initialize replay memory D to capacity N
  Initialize action-value function Q with random weights θ
  Initialize target action-value function Q̂ with weights θ⁻ = θ
  For episode = 1, M do
    Initialize sequence s₁ = {x₁} and preprocessed sequence φ₁ = φ(s₁)
    For t = 1, T do
      With probability ε select a random action a_t,
        otherwise select a_t = argmax_a Q(φ(s_t), a; θ)
      Execute action a_t in the emulator and observe reward r_t and image x_{t+1}
      Set s_{t+1} = s_t, a_t, x_{t+1} and preprocess φ_{t+1} = φ(s_{t+1})
      Store transition (φ_t, a_t, r_t, φ_{t+1}) in D
      Sample a random minibatch of transitions (φ_j, a_j, r_j, φ_{j+1}) from D
      Set y_j = r_j if the episode terminates at step j+1,
          y_j = r_j + γ max_{a'} Q̂(φ_{j+1}, a'; θ⁻) otherwise
      Perform a gradient descent step on (y_j − Q(φ_j, a_j; θ))² with respect to the network parameters θ
      Every C steps reset Q̂ = Q
    End For
  End For

Figure 1: Deep Q-learning with experience replay

You can refer to the original paper for the details of DQN: "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529.

3 Experiment Description

• Programming language: python3
• You should compare the performance of DQN and one kind of improved DQN, and test them in a classical RL control environment, MountainCar. OpenAI Gym provides this environment, which is implemented in Python (https://gym.openai.com/envs/MountainCar-v0/). Gym also provides other, more complex environments like Atari games and MuJoCo.
Since the state is abstracted into the car's position, a convolutional layer is not necessary in our experiment. You can get started with OpenAI Gym by referring to this link (https://gym.openai.com/docs/). Note that it is suggested to implement your neural network in TensorFlow or PyTorch.

4 Report and Submission

• Your report and source code should be compressed and named after "studentID+name+assignment4".
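The assignment asks for a deep network, which is not reproduced here. The pure-Python sketch below only illustrates the supporting machinery from Algorithm 1 — the replay buffer, ε-greedy action selection, and the TD target — with the Q-network abstracted away as a plain list of action values (all names are illustrative):

```python
import random
from collections import deque

# Structural sketch of DQN's training-loop machinery. A real solution would
# replace the abstract q_values lists with a PyTorch/TensorFlow network.
class ReplayBuffer:
    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)  # old transitions fall off the end

    def push(self, transition):            # transition = (s, a, r, s_next, done)
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buf, batch_size)

    def __len__(self):
        return len(self.buf)

def epsilon_greedy(q_values, epsilon, rng=random):
    # q_values: Q(s, a) for each action; explore with probability epsilon.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def td_target(r, q_next_target, gamma, done):
    # y = r if terminal, else r + gamma * max_a' Q_target(s', a').
    return r if done else r + gamma * max(q_next_target)
```

A Double-DQN-style improvement would change only td_target: pick the argmax action with the online network but evaluate it with the target network.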

$25.00

[SOLVED] CS3316 Assignment 3

1 Introduction

The goal of this assignment is to experiment with model-free control, including on-policy learning (Sarsa) and off-policy learning (Q-learning). For a deeper understanding of the principles of these two iterative approaches and the differences between them, you will implement Sarsa and Q-learning on the Cliff Walking example, respectively.

2 Cliff Walking

Figure 1: Cliff Walking

Consider the gridworld shown in Figure 1. This is a standard undiscounted, episodic task, with start state (S), goal state (G), and the usual actions causing movement up, down, right, and left. Reward is -1 on all transitions except those into the region marked "The Cliff". Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start.

3 Experiment Requirements

• Programming language: python3
• You should build the Cliff Walking environment and search for the optimal travel path with Sarsa and Q-learning, respectively.
• Different settings of the exploration parameter ε can lead to different exploration during policy update. Try several values (e.g., ε = 0.1 and ε = 0) to investigate their impact on performance.

4 Report and Submission

• Your reports and source code should be compressed and named after "studentID+name".
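The only difference between the two methods is the bootstrapping target. A minimal sketch of the two update rules (dict-based Q table; names are illustrative, not a required interface):

```python
# Q maps (state, action) -> value; alpha is the step size, gamma the discount.
def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    # Off-policy: bootstrap from the greedy action in s_next.
    target = r + gamma * max(Q.get((s_next, b), 0.0) for b in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    # On-policy: bootstrap from the action actually taken in s_next.
    target = r + gamma * Q.get((s_next, a_next), 0.0)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
```

On Cliff Walking with ε-greedy behavior, this difference is what makes Q-learning learn the risky cliff-edge path while Sarsa learns the safer detour.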

$25.00

[SOLVED] CS3316 Assignment 2

1 Introduction

The goal of this assignment is to experiment with Monte-Carlo (MC) learning and Temporal-Difference (TD) learning. MC and TD methods learn directly from episodes of experience without knowledge of the MDP model. The TD method can learn after every step, while the MC method requires a full episode to update its value estimates. Your goal is to implement MC and TD methods and test them in the small gridworld.

2 Small Gridworld

Figure 1: Gridworld

As shown in Fig. 1, each grid in the gridworld represents a certain state. Let s_t denote the state at grid t. Hence the state space can be denoted as S = {s_t | t ∈ 0,..,35}. S1 and S35 are terminal states, while the others are nonterminal states from which the agent can move one grid to the north, east, south, or west. Hence the action space is A = {n, e, s, w}. Note that actions leading out of the grid leave the state unchanged. Each movement gets a reward of -1 until a terminal state is reached.

3 Experiment Requirements

• Programming language: python3
• You should implement both the first-visit and every-visit MC methods and TD(0) to evaluate a uniform random policy π(n|·) = π(e|·) = π(s|·) = π(w|·) = 0.25.

4 Report and Submission

• Your reports and source files (.py) should be compressed and named after "studentID+name".
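As a rough sketch of the TD(0) update on a toy stand-in environment (a 5-state random-walk chain rather than the 6×6 grid; the chain and all names are illustrative — the same update generalizes directly to the gridworld):

```python
import random

def td0_evaluate(num_states, terminal, step_fn, episodes, alpha=0.1, gamma=1.0, seed=0):
    """TD(0) policy evaluation: update V after every step of every episode."""
    rng = random.Random(seed)
    V = [0.0] * num_states
    for _ in range(episodes):
        s = 0
        while s not in terminal:
            s_next, r = step_fn(s, rng)
            # TD(0) update: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V

def chain_step(s, rng):
    # Uniform random policy on a chain: step left (floor at 0) or right,
    # reward -1 per move; state num_states-1 is terminal.
    s_next = max(0, s - 1) if rng.random() < 0.5 else s + 1
    return s_next, -1.0
```

An MC method would instead record the whole episode, then update each visited state toward its (first-visit or every-visit) return — that is the "full episode" contrast the handout describes.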

$25.00

[SOLVED] CS3316 Assignment 1

1 Introduction

2 Small Gridworld

Figure 1: Gridworld

As shown in Fig. 1, each grid in the gridworld represents a certain state. Let s_t denote the state at grid t. Hence the state space can be denoted as S = {s_t | t ∈ 0,..,35}. S1 and S35 are terminal states, while the others are nonterminal states from which the agent can move one grid to the north, east, south, or west. Hence the action space is A = {n, e, s, w}. Note that actions leading out of the gridworld leave the state unchanged. Each movement gets a reward of -1 until a terminal state is reached. A good policy should be able to find the shortest way to a terminal state from any given initial non-terminal state.

3 Experiment Requirements

• Programming language: python3
• You should build the gridworld environment and implement the policy iteration and value iteration methods, respectively, to improve a uniform random policy π(n|·) = π(e|·) = π(s|·) = π(w|·) = 0.25.

4 Report and Submission

• Your report and source code should be compressed and named after "studentID+name".
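A minimal value iteration sketch on a reduced stand-in grid (2×2 instead of 6×6, one terminal state; illustrative only, not the required implementation):

```python
def value_iteration(width, height, goal, gamma=1.0, tol=1e-8):
    """Deterministic gridworld: reward -1 per move, off-grid moves stay put,
    `goal` is terminal with value 0. Returns the optimal value function."""
    moves = {'n': (0, -1), 'e': (1, 0), 's': (0, 1), 'w': (-1, 0)}
    V = {(x, y): 0.0 for x in range(width) for y in range(height)}
    while True:
        delta = 0.0
        for s in V:
            if s == goal:
                continue
            best = -float('inf')
            for dx, dy in moves.values():
                nx, ny = s[0] + dx, s[1] + dy
                s_next = (nx, ny) if (nx, ny) in V else s  # off-grid: unchanged
                best = max(best, -1.0 + gamma * V[s_next])  # Bellman optimality backup
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```

Policy iteration differs in that it alternates full policy evaluation (here, of the uniform random policy initially) with greedy policy improvement, rather than folding the max into every backup.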

$25.00

[SOLVED] CS3241 Lab 3

LEARNING OBJECTIVES

Write an OpenGL program to simulate planar reflection using texture mapping and a multi-pass rendering technique. After completing the programming assignment, you should have learned how to
• set up texture mapping in OpenGL,
• model and draw texture-mapped objects,
• set up an off-center view frustum,
• read back the image in the framebuffer for texture mapping, and
• simulate planar reflection using a multi-pass rendering technique.

TASKS

From the Canvas > CS3241 > Files > Lab Assignments folder, download the ZIP file Lab3_todo_(*).zip. You are to complete an incomplete C++ application program so that it can simulate planar reflection, using a multi-pass rendering technique and the texture-mapping capabilities provided by OpenGL. You have to complete the program according to the following requirements.

Task 1

You have to complete main.cpp to produce the planar reflection shown in the sample images. (You may need to run chmod +x main_done to give the file execute permission before running it.) Please read the instructions shown in the console/terminal window to learn how to operate the program.

The 3D scene contains a table with a flat rectangular semi-reflective table-top. The scene is also populated with other objects, at least some resting on the table-top. The table-top must reflect the scene. Here are some additional requirements:

• The reflection on the table-top is created by texture-mapping a reflection image onto the table-top rectangle. The reflection image is generated by drawing the scene seen from an imaginary viewpoint, which looks through the table-top from under the table. This rendered image is then copied from the color buffer to a texture object, to be used for texture-mapping the table-top rectangle.
• The reflection on the table-top should not be 100% (it is not a perfect mirror), and the underlying diffuse color and lighting on the table-top must still be visible.
(Hint: use the correct texture function/environment.)
• Mipmapping must be used for all texture mapping, including the reflection texture mapping. For the texture object that contains the texture image copied from the color buffer, you have to set up the texture object using
glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE);
• You are not allowed to use the stencil buffer for this assignment.
• Write your code immediately below the locations marked "WRITE YOUR CODE HERE". There are three such locations.
• You are allowed to modify only main.cpp. You are not required to, and must not, change any other source files.

Task 2

You are allowed to modify only main.cpp. You should use your own new image(s) to texture-map your new object(s). As before, mipmapping must be used for all texture mapping. Besides your completed main.cpp, you also need to submit the new texture image(s). This task will be assessed based on the fulfillment of the basic requirements, the technical difficulty and the objects' complexity, and the aesthetics and creativity.

DO NOT HARD-CODE VALUES. You should write your code in such a way that when the values of the named constants (defined in the beginning of the program) are changed to other valid values, your program still functions accordingly. For example, if the table's height is changed, the reflection from the table-top should still look correct.

GRADING

Good coding style. Comment your code adequately, use meaningful names for functions and variables, and indent your code properly. You must fill in your name and NUS User ID in the header comment.

SUBMISSION

For this assignment, you need to submit only
• your completed main.cpp, containing code for both Task 1 and Task 2;
• the file(s) of your new texture image(s) for Task 2. They must be in the images subfolder. The total image file size must not exceed 5 MB.

You must put them in a ZIP file and name your ZIP file nus-user-id_lab3.zip.
For example, if your NUS User ID is e0123456, you should name your file e0123456_lab3.zip. Note that you will be penalized for submitting non-required files.

——— End of Document ———
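The lab itself is C++/OpenGL, but the core of the Task 1 geometry is language-neutral: the imaginary under-the-table viewpoint is simply the real eye position mirrored across the plane of the table-top. A minimal sketch of that mirroring (assuming a horizontal table-top at height table_height with y as the up axis — the lab's actual constants and axes will differ):

```python
def mirror_across_tabletop(eye, table_height):
    """Reflect a 3D eye position across the horizontal plane y = table_height.

    Rendering the scene from this mirrored point (through the table-top
    rectangle, typically with an off-center frustum whose window is the
    table-top itself) produces the reflection image for the first pass.
    """
    x, y, z = eye
    return (x, 2.0 * table_height - y, z)
```

Because the mirrored camera's image is left-right flipped relative to a normal view, the second pass must also apply the texture coordinates consistently when mapping the image back onto the table-top rectangle.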

$25.00

[SOLVED] CS3241 Lab 1

LEARNING OBJECTIVES

Basic OpenGL, input & interaction, and animation. After completing the programming assignment, you should have learned
• the basic structure of an OpenGL program,
• how to use some basic OpenGL functions,
• how to use the GLUT (FreeGLUT) callbacks to get user input and enable interaction,
• how to use double-buffering to make animation look smoother, and
• how to use the GLUT (FreeGLUT) timer callback to control the speed of animation.

TASKS

From the Canvas > CS3241 > Files > Lab Assignments folder, download the ZIP file Lab1_todo_(*).zip. You are provided with an incomplete C++ application program, main.cpp, and your job is to complete it according to the requirements. The program is supposed to do the following:

• Open a blank window in the beginning.
• Let the user click the left mouse button anywhere in the window to add a disc centered at the position of the mouse cursor. The disc is given a random size (with lower and upper limits), a random speed (with an upper limit), and a random color. There is a limit to the total number of discs that the user can add.
• Once added, every disc continues to fly at a constant speed until it hits the window boundary, where it is reflected (simple reflection with no energy loss).
• The user can press the 'w' key to toggle between wireframe polygon mode and filled polygon mode.

(You may need to run chmod +x main_done to give the file execute permission before running it.) Please read the instructions shown in the console window to learn how to operate the program. Try to resize the window and see what happens to the flying discs. Press the 'w' key to switch between wireframe polygon mode and filled polygon mode.

Figures: moving discs drawn in filled polygon mode, and moving discs drawn in wireframe polygon mode.

Follow the instructions below to complete main.cpp as required. You must not add or change any other files.

1) Study the source program very carefully.

2) Complete the DrawDisc() function.
The function must draw the input disc using GL_TRIANGLE_FAN in its color. You can refer to
a. https://www.khronos.org/registry/OpenGL-Refpages/gl2.1/xhtml/glBegin.xml
b. http://www.glprogramming.com/red/chapter02.html#name2
for more information on GL_TRIANGLE_FAN. Note that the vertices on the triangle fan must be provided in counter-clockwise order. Since the trigonometric functions (e.g., sine and cosine) are quite expensive to compute, you should pre-compute the vertices once (for a unit-radius disc) and re-use them for all discs later.

3) Complete the MyMouse() function. If the left mouse button is pressed, and if the maximum limit on the number of discs has not been reached, a new disc is generated centered at the position of the mouse cursor. The disc is given
• a random radius (between MIN_RADIUS and MAX_RADIUS),
• a random speed in the x direction (between −MAX_X_SPEED and −MIN_X_SPEED, or between MIN_X_SPEED and MAX_X_SPEED),
• a random speed in the y direction (between −MAX_Y_SPEED and −MIN_Y_SPEED, or between MIN_Y_SPEED and MAX_Y_SPEED), and
• a random RGB color.

4) Set up the correct viewing in the MyReshape() function. You should use the glOrtho() or gluOrtho2D() function to set the viewing volume. The viewing volume should be set up in such a way that when the window is resized, the discs do not change their sizes or get distorted on the screen. Instead, the discs can move in the whole window interior and get reflected by the boundaries of the new window.

5) Complete the UpdateAllDiscPos() function. This function updates the position of each disc by its speed in each of the x and y directions. At its new position, if the disc is entirely or partially outside the left window boundary, shift it right so that it is inside the window and just touches the left window boundary. Its speed in the x direction must then be reversed (negated).
A similar update applies to the right, top, and bottom window boundaries.

6) Change the program to use double-buffering.

7) If you have a fast computer, you will notice that the animation is too fast to show anything clearly. You can use the GLUT timer callback to control the speed of the animation by maintaining a constant frame rate (DESIRED_FPS). Refer to https://www.opengl.org/resources/libraries/glut/spec3/node64.html to find out more about the GLUT function glutTimerFunc().

DO NOT HARD-CODE VALUES. You should write your code in such a way that when the values of the named constants (defined in the beginning of the program) are changed to other valid values, your program still functions accordingly. For example, if the value of the constant MAX_NUM_OF_DISCS is changed, your program should allow only that new maximum number of discs to be added.

GRADING

Good coding style. Comment your code adequately, use meaningful names for functions and variables, and indent your code properly. You must fill in your name and NUS User ID in the header comment.

SUBMISSION

For this assignment, you need to submit only your completed main.cpp. You must put it in a ZIP file and name your ZIP file nus-user-id_lab1.zip. For example, if your NUS User ID is e0123456, you should name your file e0123456_lab1.zip. Note that you will be penalized for submitting non-required files.

——— End of Document ———
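The lab is C++/OpenGL, but the two pieces of math in steps 2) and 5) can be sketched language-neutrally (function and parameter names are illustrative, not the lab's actual identifiers):

```python
import math

def unit_disc_vertices(num_segments):
    """Vertices on a unit circle, counter-clockwise; compute once, then
    scale by each disc's radius and translate by its center when drawing
    the GL_TRIANGLE_FAN (with the center as the fan's first vertex)."""
    return [(math.cos(2.0 * math.pi * i / num_segments),
             math.sin(2.0 * math.pi * i / num_segments))
            for i in range(num_segments)]

def bounce_1d(pos, speed, radius, lo, hi):
    """Advance one axis by `speed`; if the disc crosses a boundary of
    [lo, hi], snap it back so it just touches the boundary and negate
    its speed (simple reflection, no energy loss)."""
    pos += speed
    if pos - radius < lo:
        pos, speed = lo + radius, -speed
    elif pos + radius > hi:
        pos, speed = hi - radius, -speed
    return pos, speed
```

Applying bounce_1d independently to the x axis (against the left/right boundaries) and the y axis (against the bottom/top boundaries) gives the full UpdateAllDiscPos() behavior.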

$25.00

[SOLVED] CS3241 Lab 2

LEARNING OBJECTIVES

OpenGL viewing, OpenGL transformations, hierarchical modeling, and animation. After completing the programming assignment, you should have learned
• how to set up the view transformation (camera position and orientation) in OpenGL,
• how to set up perspective viewing in OpenGL,
• how to use the OpenGL transformations for modeling, and
• how to use the OpenGL transformations for animation.

TASKS

From the Canvas > CS3241 > Files > Lab Assignments folder, download the ZIP file Lab2_todo_(*).zip. You are provided with an incomplete C++ application program, main.cpp, and your job is to complete it according to the requirements. (You may need to run chmod +x main_done to give the file execute permission before running it.) Please read the instructions shown in the console/terminal window to learn how to operate the program.

When you run the program, you should see a spherical planet at the center of the window. There are a dozen cars moving on the planet surface; each car moves in a different great circle (you can search the web to find out what a great circle is), and they have different speeds and colors. You can:
• resize the window and see what happens,
• press the Up, Down, Left or Right arrow key to change the camera's position,
• press the Page Up or Page Down key to change the camera's distance from the planet,
• press the 'P' key to pause/resume the animation of the cars.

You will notice that the camera is always looking at the center of the planet. With respect to the planet, the camera's position can be expressed as a latitude, a longitude, and a distance from the planet's center. When the Left or Right arrow key is pressed, the camera's longitude decreases or increases, respectively; and when the Down or Up arrow key is pressed, the camera's latitude decreases or increases, respectively. Note that the camera's up-vector always points north.

Figure 1

Follow the instructions below to complete main.cpp as required.
You must not add or change any other files.

1) Study the source program very carefully.

2) Complete the DrawOneCar() function. The function must draw the car using only GLUT functions such as glutSolidCube(), glutSolidTorus(), glutSolidCone(), and glutSolidSphere(). You should not directly use any OpenGL geometric primitive. You should make use of the OpenGL 3D transformation functions to help you resize, orient, and position the parts. The functions glPushMatrix() and glPopMatrix() are very helpful for saving and restoring the current transformation before and after drawing each part. You can design your cars any way you like as long as they look like cars. More details about the GLUT functions can be found at https://www.opengl.org/resources/libraries/glut/spec3/spec3.html.

3) Complete the DrawAllCars() function. This function draws each car at the correct position on its great circle. Note that any great circle on a sphere centered at the origin can be defined as follows. Let C be the great circle in the y = 0 plane, let v be a vector in the x-z plane, and let θ be an angle. If we rotate C about v by θ, we get another great circle of the sphere. All great circles of the sphere can be obtained by varying v and θ.

4) Set up the correct perspective viewing volume in the MyDisplay() function. You should use the gluPerspective() function. The near and far planes should be set near the planet's surface, yet still not clip off any part of the planet or cars. The near and far planes should vary with the eye's distance from the planet's center. You should make use of the value of the predefined constant CLIP_PLANE_DIST to position your near and far planes.

6) Complete the MyTimer() function. You should use the GLUT timer callback to control the speed of the animation by maintaining a constant frame rate (DESIRED_FPS).
Refer to https://www.opengl.org/resources/libraries/glut/spec3/node64.html to find out more about the GLUT function glutTimerFunc().

DO NOT HARD-CODE VALUES. You should write your code in such a way that when the values of the named constants (defined in the beginning of the program) are changed to other valid values, your program still functions accordingly. For example, if the car's size is changed, the tyre size should vary proportionally.

GRADING

Good coding style. Comment your code adequately, use meaningful names for functions and variables, and indent your code properly. You must fill in your name and NUS User ID in the header comment.

SUBMISSION

For this assignment, you need to submit only your completed main.cpp. You must put it in a ZIP file and name your ZIP file nus-user-id_lab2.zip. For example, if your NUS User ID is e0123456, you should name your file e0123456_lab2.zip. Note that you will be penalized for submitting non-required files.

——— End of Document ———
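The lab is C++/OpenGL (where step 3 would be expressed as a pair of glRotatef calls), but the great-circle construction itself can be checked with a small language-neutral sketch using Rodrigues' rotation formula (all names are illustrative):

```python
import math

def rotate_about_axis(p, axis, theta):
    """Rotate 3D point p about the unit vector `axis` by angle theta
    (Rodrigues' rotation formula)."""
    px, py, pz = p
    ax, ay, az = axis
    c, s = math.cos(theta), math.sin(theta)
    dot = ax * px + ay * py + az * pz
    cx = ay * pz - az * py          # cross product axis x p
    cy = az * px - ax * pz
    cz = ax * py - ay * px
    return (px * c + cx * s + ax * dot * (1 - c),
            py * c + cy * s + ay * dot * (1 - c),
            pz * c + cz * s + az * dot * (1 - c))

def great_circle_point(radius, t, axis, theta):
    """Point at angle t on the great circle obtained by rotating the base
    circle C (in the y = 0 plane) about `axis` (a unit vector in the x-z
    plane) by theta — exactly the construction described in step 3)."""
    base = (radius * math.cos(t), 0.0, radius * math.sin(t))
    return rotate_about_axis(base, axis, theta)
```

Since rotation preserves distances from the origin, every generated point stays on the planet's surface, which is the property the cars' motion relies on.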

$25.00

[SOLVED] CS316 Lab 7

1 Introduction

2 Liveness

The first step in performing register allocation is performing a liveness analysis. We are asking you to perform liveness analysis across an entire function at once. See Week 12 (Slide 6 onwards) for more details.

2.1 Control flow graphs

The first step in computing liveness is to build a control flow graph for each function in your program. To represent your control flow graph, each IR node should know its successors (IR instructions that could possibly execute immediately after it) and predecessors (IR instructions that could possibly execute immediately before it). Conditional jumps have two successors: the explicit target of the jump and the implicit (fall-through) target of the jump. Unconditional jumps have only one successor. Function calls should be treated as straight-line IR nodes (i.e., they are not treated as branches; their successor is the instruction immediately after the call). Return nodes do not have any successors.

2.2 Computing Liveness

For each IR node in a function, you should define two sets: GEN and KILL. GEN represents all the temporaries and variables that are used in an instruction, and KILL represents all the temporaries and variables that are defined in an instruction. For most instructions, this should be pretty straightforward. A few tricky cases:
• PUSH instructions use the variable/temporary being pushed.
• POP instructions define the variable/temporary being popped.
• WRITE instructions use their variables.
• READ instructions define their variables.

Liveness is then defined as follows:
• The set of variables that are live out of a node is the union of all the variables that are live in to the node's successors.
• The set of variables that are live in to a node is the set of variables that are live out of the node, minus any variables that are killed by the node, plus any variables that are gen-ed by the node.
Note that these definitions are recursive: the live-out set of a node is defined in terms of the live-in sets of its successors, which are in turn defined in terms of the live-in sets of their successors, and so on. If there is a loop in the code, then the definition seems circular. The trick to computing liveness is to compute a fixpoint: an assignment to each of the live-in and live-out sets such that if you try to compute any node's live-in or live-out set again, you will get the same result you already have. To do this, we will use a worklist algorithm:

1. Put all of the IR nodes on the worklist.
2. Pull an IR node off the worklist, and compute its live-out and live-in sets according to the definitions above.
3. If either set changed, put the node's predecessors back on the worklist.
4. Repeat steps 2 and 3 until the worklist is empty.

(Note: you can write a slower version of this code that ignores identifying nodes' predecessors, and it still works: put all the IR nodes on the worklist and process all of them; if any live-in or live-out set has changed, put all the IR nodes back on the worklist and repeat the process.)

3 Register Allocation Algorithm

Use the bottom-up register allocation algorithm or the graph-coloring method discussed in class. For each statement, you must ensure that the source operands are in registers and that there is a register for the destination operand. Use the liveness information you computed (i.e., the live-out set for the instruction) to determine when it is safe to free registers, and when a dirty register needs to be stored back to memory (only when the variable in the register is live).

Bottom-up register allocation works at the basic-block level: any register allocation decisions you make apply for the current basic block only. This means that when you get to the end of a basic block, you must reset your register allocation. Any register that (a) holds local/global variables and (b) is dirty should be written back to the stack/global variable.
Note also that because a CALL instruction jumps into another method, any global variables that are in registers when the CALL is performed should be freed immediately prior to the CALL instruction, ensuring that the correct value for each global is in memory. This is different from saving the registers on the stack prior to a function call. The latter is done so that the caller method doesn't get its registers overwritten; the values of the registers are stored where only the caller can see them. The former is done so that the callee method sees the right values for global variables; the values need to be stored back to the globals so that everyone can see them, and freed from the registers so that the caller will reload them after the callee returns.

Testing your Tiny code

You can test your Tiny code using tiny4regs.C, a version of the simulator that limits you to 4 registers. tiny4regs.C is provided along with the starter files. There are no test cases provided with the starter files for this assignment. You must test your compiler with all the test cases of PA4, PA5, and PA6. In addition, we will test with some hidden test cases.

4 What you need to do

Perform the liveness analysis and register allocation steps as described above, so that your compiler generates code that uses only 4 registers.

Handling errors

All the inputs we will give you in this step will be valid programs. We will also ensure that all expressions are type safe: a given expression will operate on either INTs or FLOATs, but not a mix, and all assignment statements will assign INT results to variables that are declared as INTs (and likewise for FLOATs).

Grading

In this step, we will only grade your compiler on the correctness of the generated code. We will run your generated code through the Tiny simulator and check that you produce the same result as our code.
When we say result, we mean the outputs of any WRITE statements in the program (not details such as how many cycles the code uses, how many registers, etc.)

5 What you need to submit

• Place all the necessary code for your compiler that you wrote yourself.
• A Makefile with the following targets:
1. compiler: this target will build your compiler. (-1 for warnings)
2. clean: this target will remove any intermediate files that were created to build the compiler. (-1 for not doing the clean properly)
3. dev: this target will print the same information that you printed in the previous PA.
• A shell script (this must be written in bash) called runme that runs your compiler. This script should take in two arguments: first, the input program file to be compiled and second, the filename where you want to put the compiler's output. You can assume that we will have run make compiler before running this script.
• You should tag your programming assignment submission as cs316pa7submission

Do not submit any binaries. Your git repo should only contain source files; no products of compilation. If you have a folder named test in your repo, it will be deleted as part of running our test script (though the deletion won't get pushed) – make sure no code necessary for building/running your compiler is in such a directory.


[SOLVED] Cs316 lab 6

1 Introduction

Your goal in this step is to generate code to handle programs with multiple functions. This means you will have to handle two aspects: (i) what should a caller function do to prepare for calling a subroutine; (ii) what should a callee function do to set up its local variables and environment?

1.1 Function Calls

The primary mechanism for handling function calls is the program stack, which is where the local environment (activation record or frame) for each currently executing function (i.e., functions that have started executing but have not yet returned) is stored. Week 9 slides provide more details about how this program stack works.

1.2 Activation Records

An activation record, or frame, stores all of the data required to execute a function. In particular, this means that the activation record stores all of the local variables in a function. We declare global variables with var declarations in Tiny code, but that doesn't work for local variables. Why? Because a local variable is specific to that function invocation – it's not global. Consider what would happen if you wrote a recursive function: the two invocations of that recursive function each need their own copy of their local variables.

An activation record is delimited by two "pointers": the stack pointer (which is controlled with the instructions push and pop) and the frame pointer (which is controlled with the instructions link and unlink). The stack pointer points to the "top" of the stack, while the frame pointer points to the "base" of the activation record. In our stack organization, the stack conceptually grows "down". Local variables thus have negative offsets from the frame pointer, while arguments and return values have positive offsets from the frame pointer. You will need to augment your symbol table to maintain a mapping between each local variable and its slot in an activation record. (Don't forget to reset the slot counter for each new function!)
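The local-variable slot mapping just described can be sketched in Python. This is an illustrative model, not starter code: class and method names are made up, but it follows the rules above (locals sit below the frame pointer, so slot i is addressed as $-i, and the slot counter resets with each new function).

```python
# One slot table per function: maps each local variable to its
# frame-pointer-relative Tiny address ($-1, $-2, ...).

class FunctionSlots:
    def __init__(self):
        self.slots = {}      # variable name -> frame-pointer-relative address
        self.counter = 0     # fresh counter for each new function

    def add_local(self, name):
        self.counter += 1
        self.slots[name] = f"$-{self.counter}"
        return self.slots[name]
```

Creating a new FunctionSlots object for each function definition is one simple way to satisfy the "reset the slot counter" requirement.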
We recommend that you draw out the program stack for a simple program to understand how to correctly generate code for it.

1.3 Implementing a function call

You can divide up the work done for a function call into two responsibilities: those of the caller and those of the callee. Here is what each one needs to do:

Caller before the call
1. Push any registers that you want to save on the stack (using push)
2. Push a space on the stack for the return value of the callee
3. Push any arguments onto the stack
4. Call the function (using jsr)

Note: in some of the outputs, step 1 is performed after 2 and 3; this is fine, as long as you are consistent and are able to correctly know where arguments/return values are.

Callee
1. Allocate space on the stack for all the local variables (using link)
2. Generate code, accessing local variables and arguments to the function relative to the frame pointer (use $-n to access slots below the frame pointer, with n replaced with the slot location, and $n to access slots above the frame pointer)
3. When returning from the function, save the return value (if any) in the appropriate slot "above" the frame pointer (remember how the caller set up its portion of the stack).
4. Deallocate the activation record (using unlink)
5. Return to the caller (using ret)

Caller after the call
1. Pop arguments off the stack
2. Pop the return value off the stack, remembering to store it in an appropriate place (local variable, global variable, register, etc., as needed by the source code)
3. Pop any saved registers off the stack.

In this step, your code generation strategy likely means that no registers actually need to be saved on the stack by the caller, because none are "live" across the function call. If you choose not to save registers in this step, remember to add that functionality back in for the next step (register allocation).

Testing your Tiny code

You can test your Tiny code by using the same simulator as in the previous step.
Your compiler will be tested against the inputs that we provide with the starter files and also some hidden test cases.

2 What you need to do

In this step, you will be generating assembly code for function calls, as described above. You should be able to correctly handle functions with return values, functions where complex expressions are passed in as arguments (store the result in a temporary, then push that temporary onto the stack as the argument), and recursive functions.

Handling errors

All the inputs we will give you in this step will be valid programs. We will also ensure that all expressions are type safe: a given expression will operate on either INTs or FLOATs, but not a mix, and all assignment statements will assign INT results to variables that are declared as INTs (and respectively for FLOATs). Sample inputs and outputs are provided to you along with the starter files that you need to get started on this assignment. These contain a test case with non-recursive functions (fma) and a couple of test cases with recursive functions, factorial2 and fibonacci.

Grading

In this step, we will only grade your compiler on the correctness of the generated code. We will run your generated code through the Tiny simulator and check to make sure that you produce the same result as our code. When we say result, we mean the outputs of any WRITE statements in the program (not details such as how many cycles the code uses, how many registers, etc.)

3 What you need to submit

• A Makefile with the following targets:
1. compiler: this target will build your compiler. (-1 for warnings)
2. clean: this target will remove any intermediate files that were created to build the compiler. (-1 for not doing the clean properly)
3. dev: this target will print the same information that you printed in previous PA.
• A shell script (this must be written in bash) called runme that runs your compiler.
This script should take in two arguments: first, the input program file to be compiled and second, the filename where you want to put the compiler's output. You can assume that we will have run make compiler before running this script.
• You should tag your programming assignment submission as cs316pa6submission

Do not submit any binaries. Your git repo should only contain source files; no products of compilation. If you have a folder named test in your repo, it will be deleted as part of running our test script (though the deletion won't get pushed) – make sure no code necessary for building/running your compiler is in such a directory.


[SOLVED] Cs316 lab 5

1 Introduction

Your goal in this step is to generate executable code for loops. You already generated code for altering the flow of control for the 'IF' construct in the previous assignment. In this assignment, you will generate code for the 'WHILE' loop, which may contain BREAK and CONTINUE statements.

Testing your Tiny code

You can test your Tiny code by using the same simulator as in the previous step. Your compiler will be tested against only the inputs that we provide. However, in the next assignment, there will be hidden test cases. Sample inputs and outputs are provided to you along with the starter files that you need to get started on this assignment.

Handling errors

All the inputs we will give you in this step will be valid programs. We will also ensure that all expressions are type safe: a given expression will operate on either INTs or FLOATs, but not a mix, and all assignment statements will assign INT results to variables that are declared as INTs (and respectively for FLOATs).

Grading

In this step, we will only grade your compiler on the correctness of the generated code. We will run your generated code through the Tiny simulator and check to make sure that you produce the same result as our code. When we say result, we mean the outputs of any WRITE statements in the program (not details such as how many cycles the code uses, how many registers, etc.)

2 What you need to submit

• Place all the necessary code for your compiler that you wrote yourself. You do not need to include the ANTLR jar files if you are using ANTLR.
• A Makefile with the following targets:
1. compiler: this target will build your compiler. If you are using ANTLR, this should create a .jar file.
2. clean: this target will remove any intermediate files that were created to build the compiler.
3. dev: this target will print the same information that you printed in previous PA.
• A shell script (this must be written in bash) called runme that runs your compiler.
This script should take in two arguments: first, the input program file to be compiled and second, the filename where you want to put the compiler's output. You can assume that we will have run make compiler before running this script.
• You should tag your programming assignment submission as cs316pa5submission

Do not submit any binaries. Your git repo should only contain source files; no products of compilation. If you have a folder named test in your repo, it will be deleted as part of running our test script (though the deletion won't get pushed) – make sure no code necessary for building/running your compiler is in such a directory.


[SOLVED] Cs316 lab 4

1 Introduction

Your goal in this step is to generate executable code for statements including expressions, assignment, READ/WRITE, and IF. To do this, you will build semantic actions that generate code in an intermediate representation (IR) for statements, and then translate that intermediate representation to assembly code. We recommend that you do this in three steps, as it will make it easier to debug your code, but you can also choose to do it in two steps (step 1 is optional):

1. Generate an abstract syntax tree (AST) for the code in your function.
2. Convert the AST into a sequence of IR Nodes that implement your function using three address code.
3. Traverse your sequence of IR Nodes to generate assembly code.

(Note: in this step, we will only have one function in your program, main.)

2 Abstract Syntax Tree

An Abstract Syntax Tree is, essentially, a cleaned-up form of your parse tree that more straightforwardly captures the structure of expressions, control constructs, etc. in your program. For many compilers, the AST is the intermediate representation, though we will further convert the AST into another intermediate representation.

What is the difference between a parse tree and an AST? Parse trees capture all of the little details necessary to implement your grammar. This means that a parse tree often contains extraneous information beyond what is necessary to capture the details of a piece of code (e.g., there are nodes for tokens like ";", and nodes for all of the sub-constructs we used to correctly implement order of operations). ASTs, in contrast, contain exactly the information needed to capture the meaning of an expression, including being structured to preserve order of operations. For example, consider the parse tree for a + b * c (figure omitted). Complicated, huh? Here's an abstract syntax tree that captures the same thing (figure omitted). Much simpler!
We aren’t preserving anything except the bare minimum needed to describe the expression (note that we included the type of each of the variables in the program – we can get that information from our symbol table!). Building an AST 2.1 Building an AST • add_op : generate an AddExpr AST node that has two children (that you leave uninitialized) and keeps track of the operator (+ or -). 1. If expr_prefix is NULL, make the add_op node’s left child the node from factor and return up the add_op node (note that it won’t have its right child filled in!) 2. If expr_prefix isn’t NULL, note that it will be missing its right child. Make the factor node its right child, then make the expr_prefix node the add_op node’s left child, which you pass up. The basic idea: creating AST nodes when you have the information for a new node, then filling in various fields of the node as you work your way up the parse tree, will let you eventually create an AST for all the statements in the function. Hint: you should also create an AST node to capture lists of statements; each element of the list will point to an AST node for a single assign_stmt. 2.2 ASTs for Control Structures ASTs for control structures are, intuitively, simple: each control structure will have several children (3 in the case of an IF statement, 4 in the case of a FOR loop) that are themselves ASTs (ASTs for statement lists in the case of the bodies of IF statements and FOR loops, ASTs for conditional expressions in the case of the conditions in the IF statements and FOR loops, etc.). You can extend your code for building an AST for statement lists (and can readily adapt your code for binary expressions to build ASTs for conditional expressions), all you have to do is create semantic actions for the control structures that ‘stitch together” the existing ASTs. 3 IR: 3 Address Code The next step in our compilation process is to generate 3 Address Code (3AC), which is our intermediate representation. 
3AC is an intermediate representation where each instruction has at most two source operands and one destination operand. Unlike assembly code, 3AC does not have any notion of registers. Instead, the key to 3AC is to generate temporaries – variables that are used to hold the intermediate results of computations. For example, the 3AC for d := a + b * c (where all variables are integers) will be:

MULTI b c $T1
ADDI a $T1 $T2
STOREI $T2 d

3.1 Generating 3AC

Generating 3AC is straightforward from an AST. We can perform a post-order walk of the tree, passing up increasingly longer sequences of IR code called CodeObjects. Each CodeObject retains three pieces of information:

1. A sequence of IR Nodes (a structure representing a single 3AC instruction) that holds the code for this part of the AST (i.e., that implements this part of the expression)
2. An indication of where the "result" of the IR code is being stored (think: the name of the temporary or variable where the result of the expression is stored)
3. An indication of the type of the result (INT or FLOAT)

Then, when we encounter something like an AddExpr node, we can generate code for the overall expression as follows:

1. Create a new CodeObject whose code list is all the code from the left child of the AddExpr followed by all the code for the right child.
2. Use the result fields of the left and right CodeObjects to create a new 3AC instruction performing the add, storing the result in a new temporary. Add this new instruction to the end of your code list.
3. Indicate in your CodeObject the temporary where the result is stored, and its type.
4. Return the new CodeObject up the AST as part of your post-order walk.

Hint: the CodeObject for a simple variable won't have any 3AC code associated with it. Instead, mark the variable itself as the "temporary" the result is stored in.
Hint: You will need a helper function to generate "fresh" temporaries.
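The CodeObject recipe above can be sketched in Python. This is an illustrative model, not starter code: the tuple encoding of AST nodes is an assumption, but the walk follows the steps above (concatenate the children's code, append one new instruction into a fresh temporary, and pass the result and type up).

```python
# Post-order 3AC generation with CodeObjects.
# AST nodes are tuples: ("var", name, type) for leaves, or (opcode, left, right)
# for binary expressions, where opcode is a 3AC mnemonic string.

_temp = 0
def new_temp():
    """Helper that generates 'fresh' temporaries: $T1, $T2, ..."""
    global _temp
    _temp += 1
    return f"$T{_temp}"

class CodeObject:
    def __init__(self, code, result, typ):
        self.code, self.result, self.typ = code, result, typ

def gen(node):
    if node[0] == "var":                  # leaf: no code; the variable is the "temporary"
        _, name, typ = node
        return CodeObject([], name, typ)
    op, left, right = node
    l, r = gen(left), gen(right)
    t = new_temp()
    code = l.code + r.code + [f"{op} {l.result} {r.result} {t}"]
    return CodeObject(code, t, l.typ)
```

Running this on the AST for a + b * c reproduces the worked example above; appending a STOREI of the final result into d completes the assignment.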
Then, when you get to the top of the AST, you will have a single CodeObject that contains all of the IR code for the entire main function.

Note: We are generating code by performing a post-order walk of the AST. You can also generate code using this strategy by performing a post-order walk of the parse tree (which is why you can optionally skip building the AST).

3.2 Generating 3AC for Control Structures

Generating 3AC for control structures builds on your ability to generate code for lists of statements. This means that when you are generating code for an IF AST node, you know that the 3AC for the three children already exists. All that is left is to put them together in the correct order and insert any necessary labels and jumps. There are two things that you need to pay attention to when putting together 3AC:

Generating Labels: at various points in your code, you will need to insert labels and jumps to allow control to transfer from one part of your code to another. You will need to make sure that you can generate unique labels every time (since your code will not work properly if there are multiple labels with the same name). The 3AC you will generate for labels looks like: LABEL STRING, where STRING is whatever name you decide to give to your label.

Generating Jumps: unconditional jumps (like you might use to jump over an ELSE block) are easy: JUMP STRING, where STRING is the label you want to jump to. Conditional jumps are a little bit tricky in our 3AC (and in Tiny): you need to generate the right kind of jump. The list of 3AC instructions for conditional jumps is provided in the next section.
3.3 3AC instructions

Here are the 3AC instructions you should use:

ADDI OP1 OP2 RESULT (Integer add; RESULT = OP1 + OP2)
SUBI OP1 OP2 RESULT (Integer sub; RESULT = OP1 - OP2)
MULI OP1 OP2 RESULT (Integer mul; RESULT = OP1 * OP2)
DIVI OP1 OP2 RESULT (Integer div; RESULT = OP1 / OP2)
ADDF OP1 OP2 RESULT (Floating point add; RESULT = OP1 + OP2)
SUBF OP1 OP2 RESULT (Floating point sub; RESULT = OP1 - OP2)
MULF OP1 OP2 RESULT (Floating point mul; RESULT = OP1 * OP2)
DIVF OP1 OP2 RESULT (Floating point div; RESULT = OP1 / OP2)
STOREI OP1 RESULT (Integer store; store OP1 in RESULT)
STOREF OP1 RESULT (Floating point store; store OP1 in RESULT)
READI RESULT (Read integer from console; store in RESULT)
READF RESULT (Read float from console; store in RESULT)
WRITEI OP1 (Write integer OP1 to console)
WRITEF OP1 (Write float OP1 to console)
WRITES OP1 (Write string OP1 to console)
GT OP1 OP2 LABEL (If OP1 > OP2 Goto LABEL)
GE OP1 OP2 LABEL (If OP1 >= OP2 Goto LABEL)
LT OP1 OP2 LABEL (If OP1 < OP2 Goto LABEL)
LE OP1 OP2 LABEL (If OP1 <= OP2 Goto LABEL)
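Using the conditional-jump, JUMP, and LABEL instructions listed above, the IF/ELSE shape described in Section 3.2 can be sketched in Python. This is an illustrative sketch, not starter code: the label names are made up, and the caller supplies the jump on the *opposite* condition (e.g., a source-level a < b becomes a GE jump to the ELSE block).

```python
# Emitting the label/jump skeleton for an IF/ELSE statement in 3AC.
# then_code and else_code are the already-generated 3AC for the two bodies.

_label = 0
def new_label():
    """Generate a unique label each time, as required above."""
    global _label
    _label += 1
    return f"label{_label}"

def gen_if(opposite_cond_jump, then_code, else_code):
    else_l, out_l = new_label(), new_label()
    code = [f"{opposite_cond_jump} {else_l}"]   # skip THEN when the test fails
    code += then_code
    code += [f"JUMP {out_l}",                   # jump over the ELSE block
             f"LABEL {else_l}"]
    code += else_code
    code += [f"LABEL {out_l}"]
    return code
```

For example, gen_if("GE a b", then_3ac, else_3ac) produces the code shape for IF (a < b) THEN ... ELSE ... ENDIF.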


[SOLVED] Cs316 lab 3

1 Introduction

Your goal in this step is to process variable declarations and create a Symbol Table. A symbol table is a data structure that keeps information about non-keyword symbols that appear in source programs. Variable and function names are examples of such symbols. The symbols added to the symbol table will be used in many of the further phases of the compilation.

In the previous assignment, your compiler answered whether an input program was syntactically valid or not. As a result, you didn't need token values, and only the token types were used by the parser generator tools to guide the parsing. A compiler's job, in addition to answering whether a program is syntactically valid, is to translate a syntactically valid program. So, in this assignment, your parser needs to get token values such as identifier names and string literals from your scanner. You also need to add semantic actions to create symbol table entries and add those to the symbol table.

2 Background

2.1 Semantic Actions

Semantic actions are steps that your compiler takes as the parser recognizes constructs in your program. Another way to think about this is that semantic actions are code that executes as your compiler matches various parts of the program (constructs like variable declarations or tokens like identifiers). By taking the right kind of action when the right kind of construct is recognized, you can make your compiler do useful work! Consider the declaration:

STRING foo := "Hello World";

which produces a (partial) parse tree (figure omitted). We can create semantic records for each of the tokens IDENTIFIER and STRINGLITERAL that record their values ("foo" and "Hello World", respectively), and "pass them up" the tree so that those records are the records for id and str.
We can then construct a semantic record for string_decl using the semantic records of its children to produce a structure that captures the necessary information for a string declaration entry in your symbol table (and even add the entry to the symbol table).

2.1.1 Parser-driven Semantic Actions

For both ANTLR and Bison, it is worth remembering two things:

1. Tokens become leaves in the parse tree. The semantic record for a token is always either the text associated with that token (if you're using ANTLR) or whatever you assign yylval to in your scanner (if you're using flex/bison).
2. Every symbol that shows up in a grammar rule will be a node in your parse tree. If you recognize a grammar rule, there will be a node in your parse tree associated with the left-hand side of the rule, and that node will have a separate child for each of the symbols that appear on the right-hand side.

2.1.2 ANTLR

In ANTLR, you can put arbitrary Java code in braces ({}) at various points in the right-hand side of a grammar rule in your .g4 file; this code will execute as soon as the symbol immediately before the braces has been matched (if it's a token) or predicted and matched (if it's a non-terminal). The "main" semantic action for a rule will go at the very end of the rule (in other words, it will execute once the entire rule has been matched). As part of this rule, you can assign a value to the semantic record that will be associated with the left-hand side of the rule. You name the semantic record and tell ANTLR what type that record should have using a returns annotation. You can then extract information from the semantic records of the children by giving variable names to the children you have matched.
You can either access that child's semantic record by accessing $a.x, where a is the name you gave the child and x is the name you gave to the child's semantic record in the returns annotation, or you can access the characters associated with the child (useful when matching tokens) with $a.text. So suppose we wanted to return a structure called StrEntry from our string_decl rule. It might look something like this:

string_decl returns [StrEntry s] :
STRING id ASSIGNOP ex=str SEMICOLON
{$s = new StrEntry(); $s.addID($id.text); $s.addValue($ex.text);}

(Note that if you don't give a child a name (like id in the above example), ANTLR defaults to using the name of the token or non-terminal. If the same symbol shows up twice in a rule, you must give them names.)

2.1.3 Bison

In Bison, you can put arbitrary C (or C++) code in braces at various points in the right-hand side of a grammar rule in your .y file. It works very similarly to ANTLR in that respect. The key difference in Bison is understanding how to pass data around. To set the type of a semantic record for a non-terminal, you use a %type command:

%type <s_entry> string_decl
%type <s> id str

The above says that the type name associated with the symbols id and str is s, and the type name associated with the symbol string_decl is s_entry. All of the types that you want to use (as part of semantic actions) need to be part of a union that determines the possible types for yylval, which you declare in a %union declaration:

%union {
StrEntry *s_entry;
string *s;
}

The above says that StrEntry and string are the alternatives for the type of a semantic record, and each type has been given a name—s_entry for StrEntry and s for string. So the %type commands and %union command in conjunction mean that the string_decl non-terminal will produce a semantic record named s_entry of type StrEntry * and the non-terminals id and str will produce semantic records of type string * named s. (You can also do this by making yylval a struct instead of a union.)
Then, when building a semantic action for a rule, $$ refers to the semantic record you are building for the left-hand side (whose type is determined by the %type command), and $1, $2, etc. refer to the semantic records for the right-hand side, listed in order. So here is the equivalent action for creating a StrEntry object for the string_decl rule:

string_decl : STRING id ASSIGN_OP str SEMI
{$$ = new StrEntry(); $$->addID($2); $$->addValue($4);};

3 Symbol Tables

Your task in this step of the project is to construct symbol tables for each scope in your program. For each scope, construct a symbol table, then add entries to that symbol table as you see declarations. The declarations you have to handle are integer/float declarations, which should record the name and type of the variable, and string declarations, which should additionally record the value of the string. Note that typically function declarations/definitions would result in entries in the symbol table too, but you do not have to record them for this step.

Nested Symbol Tables

In this year's variant of Micro, there are multiple scopes where variables can be declared:

• Variables can be declared before any functions. These are "global" variables, and can be accessed from any function.
• Variables can be declared as part of a function's parameter list. These are "local" to the function, and cannot be accessed by any other function.
• Variables can be declared at the beginning of a function body. These are "local" to the function as well.
• Variables can be declared at the beginning of an IF-then block, an ELSE block, or a WHILE statement. These are "local" to the block itself. Other blocks, even in the same function, cannot access these variables.

Note that the scopes in the program are nested (function scopes are inside the global scope, and block scopes are nested inside function scopes, or each other).
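One way to track these nested scopes is a stack of hash tables, sketched here in Python. This is an illustrative model, not starter code: class and method names are made up, but it follows the scope rules above (innermost scope wins on lookup, dicts preserve declaration order, and block scopes get an incrementing BLOCK X name).

```python
# A stack of hash tables for nested scopes (GLOBAL / function / BLOCK X).
# Python dicts preserve insertion order, which matches the requirement
# that symbol-table entries appear in declaration order.

class SymbolTables:
    def __init__(self):
        self.stack = []            # innermost scope is last
        self.block_counter = 0

    def open_scope(self, name=None):
        if name is None:           # anonymous scopes become BLOCK 1, BLOCK 2, ...
            self.block_counter += 1
            name = f"BLOCK {self.block_counter}"
        self.stack.append((name, {}))

    def close_scope(self):
        return self.stack.pop()

    def declare(self, name, typ, value=None):
        scope = self.stack[-1][1]
        if name in scope:          # duplicate in the *same* scope is an error
            raise KeyError(f"DECLARATION ERROR {name}")
        scope[name] = (typ, value)

    def lookup(self, name):
        for _, scope in reversed(self.stack):   # innermost scope wins
            if name in scope:
                return scope[name]
        return None
```

A declaration shadowing a global in a function scope is legal here; only two declarations of the same name in the same scope trigger the error.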
You will have to keep track of this nesting so that when a piece of code uses a variable named "x" you know which scope that variable is from. To handle this, we suggest you follow the implementation strategy based on hash tables discussed in class.

4 What you need to do

You should define the necessary semantic actions and data structures to let you build the symbol table(s) for Micro input programs. At the end of the parsing phase, you should print out the symbols you found. For each symbol table in your program, use the following format:

Symbol table name
name type
name type value
. . .

The global scope should be named "GLOBAL", function scopes should be given the same name as the function name, and block scopes should be called "BLOCK X" where X is a counter that increments every time you see a new block scope. Function parameters should be included as part of the function scope.

The order of declarations matters! We expect the entries in your symbol table to appear in the same order that they appear in the original program. Keep this in mind as you design the data structures to store your symbol tables. See the sample outputs for more complete examples of what we are looking for.

Handling errors

If there are two declarations with the same name in the same scope, your compiler should output the string DECLARATION ERROR (previous declaration was at line .) Sample inputs and outputs are provided to you along with the starter files that you need to get started on this assignment.

5 What you need to submit

• Place all the necessary code for your compiler that you wrote yourself. You do not need to include the ANTLR jar files if you are using ANTLR.
• A Makefile with the following targets:
1. compiler: this target will build your compiler. If you are using ANTLR, this should create a .jar file.
2. clean: this target will remove any intermediate files that were created to build the compiler
3.
dev: this target will print the same information that you printed in previous PA.
• A shell script (this must be written in bash) called runme that runs your compiler. This script should take in two arguments: first, the input program file to be compiled and second, the filename where you want to put the compiler's output. You can assume that we will have run make compiler before running this script.
• You should tag your programming assignment submission as cs316pa3submission

Do not submit any binaries. Your git repo should only contain source files; no products of compilation.


[SOLVED] Cs2200 project 1

Pipelining Extra Credit: LC-3300-pipe

1 Why Pipelining?

The datapath design that we implemented for Project 1 was, in fact, grossly inefficient. By increasing throughput, a pipelined processor completes more instructions per clock cycle on average. In the real world, that means higher performance, lower power draw, and most importantly, happy customers!

2 Project Requirements

In this extra credit project, you will make a pipelined processor that implements the LC-3300-pipe ISA. There will be five stages in your pipeline:

1. IF – Instruction Fetch
2. ID/RR – Instruction Decode/Register Read
3. EX – Execute (ALU operations)
4. MEM – Memory (both reads and writes with memory)
5. WB – Writeback (writing to registers)

Before you move on, read Appendix A: LC-3300-pipe Instruction Set Architecture to understand the ISA that you will be implementing. Understanding the instructions supported by your ISA will make designing your pipeline much easier.

3 Building the Pipeline

First, you will have to build the hardware to support all of your instructions. You will have to make each stage such that it can accommodate the actions of all instructions passing through it. Use the book (Ch. 5) to get an idea of what the pipeline looks like and to understand the function of each stage before you start building your circuits. (Figure 1: Pipeline Diagram)

1. IF Stage

Functionality:
• PC Update: In the IF stage, we need to update the PC's value in order to fetch the proper next instruction. For normal sequential execution, the IF stage should update the PC by incrementing it by 1. However, in the case of branches such as with JALR and BEQ, the PC's value should be set to whatever new address was calculated during the EX Stage.
• Fetch from I-MEM: We will then use the PC value to index into I-MEM and retrieve an instruction. Note that I-MEM has 16 address bits, so you will need some circuit to reduce the PC's bits from 32 bits to 16 bits.
Considerations for Implementation:
• Selecting the proper PC: Choosing whether the PC should be updated to PC+1 or to the address calculated in EX can be accomplished with a multiplexer. Consider what the selector bits for this MUX must be; they come from the EX Stage.

2. ID/RR Stage

Functionality:
• Obtaining Data from the Instruction: The first thing to accomplish in the ID/RR stage is to obtain the opcode of the 32-bit instruction that was fetched in the IF stage. Consider all other values that need to be obtained directly from the instruction, such as register numbers and the immediate/offset value.
• Decoding the Instruction: Once we have the opcode for our instruction, we need to determine the specific control signals that need to be asserted. Like our LC-3300 uniprocessor that we implemented in projects 1 and 2, we can use ROMs that store microcode for each instruction's control signals. Given the opcode, the ROMs should output the necessary control signals required for a specific instruction.
• Reading Registers: Our instruction might require that we read from one or two registers. If reading from a single register, we use only 1 port in our dual-ported register file (DPRF), using the register number specified in the instruction to read a value. If given two register numbers, then we utilize both ports in the DPRF to read from both simultaneously.
• Selecting "A" and "B": Like the ALU in projects 1 and 2, the EX Stage utilizes two operands, "A" and "B". You must select the proper value for A and the proper value for B based on the instruction being executed (for example, select PCOffset20, immval20, a register value, etc.).

Considerations for Implementation:
• ROMs: Unlike the processor implemented in projects 1 and 2, there is no need for next state bits. Implement a component using one or more ROMs that takes in an opcode and returns the control signals associated with that opcode.
• Dual Ported Register File: When implementing the DPRF, consider that you only need to read from two registers at once. You will not have to write to more than one register simultaneously. • Data Forwarding: If you decide to implement data forwarding, note that your selector for A and B should also be able to select any forwarded values. 3. EX Stage Functionality: • Calculation: The EX Stage must compute calculations using A and B. For example, it will calculate the sum for an ADD instruction, or the Base + Offset computation for LW and SW. This should be done with a complete ALU, capable of performing adding, nanding, shifting, and any other required operation. • Branch Evaluation: The EX Stage must evaluate a branch condition, and then determine if the branch is taken or not taken. This can be performed by installing a comparison logic unit. Considerations for Implementation: • Branching: As previously mentioned, the IF stage will require information about the branch evaluation performed in the EX Stage. Consider implementing forwarding lines that can forward this information between stages. 4. MEM Stage Functionality: • Address Calculation: The effective address for memory operations is derived from the Execute (EX) stage, typically involving arithmetic operations or immediate values combined with base register values. In systems with a 16-bit address space, it is crucial to use only the lower 16 bits of the calculated address, requiring a masking operation to discard any higher-order bits. • Read Operation: Load instructions necessitate reading data from the memory address calculated in the MEM stage. This involves initiating a memory access with the calculated address and retrieving the corresponding data. • Write Operation: Store instructions require writing data from a source register to the memory address determined in the MEM stage. The operation must ensure data is correctly written to the intended memory location. 5. 
WB Stage Functionality: • The WB stage selectively writes values back to the registers. This involves interfacing directly with the data in and write enable inputs of the DPRF, ensuring that results of computations or memory operations are correctly stored. • The dual-ported nature of the DPRF allows simultaneous read and write operations on different registers within the same clock cycle. This design facilitates sharing of the DPRF between the ID/RR and WB stages. Considerations for Implementation: • Control Logic: Implementing control logic within the WB stage is crucial for determining whether a write-back operation is required based on the instruction type. • Data Selection: The WB stage must select the correct data source for write-back operations. This is typically between data fetched from memory or computation results generated in previous stages. Multiplexers, guided by control signals, can be a good design choice. 4 General Advice 4.1 Pipeline Buffers • Identify and support the requirements of all possible instructions by analyzing their needs. • Pass a union of all requirements through the buffers to ensure functionality across diverse instructions. • Optionally, implement dynamic buffer space utilization to optimize for instruction-specific requirements. 4.2 Control Signals • Reflect on the shift from a single ROM source for control signals to more flexible implementation strategies. • Consider each pipeline stage as a standalone processor that performs a specific task within one cycle, reducing the need for centralized control ROM. • For simplicity and ease of debugging, a control ROM is suggested to generate control signals for each pipeline stage. 4.2.1 Control Signal Implementation Options We have provided an optional microcode sheet that you can use to translate an opcode into signals. It is up to the programmer to decide what signals are needed; you will likely not need to use every column. 1. 
Opt for smaller, stage-specific ROMs that generate signals based on the opcode, and pass the opcode through buffers. 2. Use a single, large main ROM at the ID/RR stage for all control signals, and pass all necessary control signals through buffers. 3. Other implementations for control signals are also accepted if they work.

4.3 Stalling and Data Forwarding

4.3.1 Stalling the Pipeline
• Initiate stalling when data hazards prevent instruction progression to maintain data integrity.
• Implement stalling by disabling buffer writes in preceding stages and issuing NOOP instructions until the hazard is resolved.

4.3.2 Implementing Data Forwarding
• Utilize data forwarding to minimize stalls by enabling early access to values computed in later stages.
• Design a forwarding unit that evaluates the necessity for forwarding based on register comparisons.
• Address limitations, acknowledging scenarios like "load-to-use" hazards that still require stalling.
• Data forwarding with WAW hazards:
– The priority encoder is a hardware component in CircuitSim that can choose priority between stages, resolving all WAW hazards with no additional bubbles.
– Busy bits, as detailed in the textbook, involve stalling the pipeline upon encountering a WAW hazard until the first instruction exits the WB stage.

4.3.3 Special Considerations for Forwarding and Stalling
• Implement selective forwarding logic for instructions that do not perform register writes. Keep in mind: the zero register can never change, so it should not be considered in forwarding and stalling decisions. Additionally, not all instructions write back to a register, so blindly checking bits [27:24] does not work for many instructions. Forwarding, however, cannot save you from one situation: when the destination register of a LW instruction is a source register of the instruction immediately after it. In this case, sometimes called "load-to-use", you must stall the instruction in the ID/RR stage.
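The forwarding priorities and the load-to-use stall just described can be sketched in C-like pseudocode. The latch fields and return codes below are hypothetical (your CircuitSim wiring will differ), but the priority rules carry over:

```c
/* Hypothetical per-stage latch summary for hazard checks. */
typedef struct {
    int writes_reg;  /* does this instruction write a register at all? */
    int dest;        /* destination register number (bits [27:24])     */
    int is_load;     /* is this instruction a LW?                      */
} stage_latch_t;

/* 0 = use register-file value, 1 = forward from MEM, 2 = forward from WB.
 * Newer results (MEM) take priority over older ones (WB), which is what a
 * priority encoder gives you for free. $zero is never forwarded. */
int forward_select(int src_reg, stage_latch_t mem, stage_latch_t wb) {
    if (src_reg == 0) return 0;
    if (mem.writes_reg && mem.dest == src_reg) return 1;
    if (wb.writes_reg && wb.dest == src_reg) return 2;
    return 0;
}

/* Load-to-use: the LW in EX only produces its value in MEM, so a
 * dependent instruction must stall in ID/RR for one cycle. */
int must_stall(int src_reg, stage_latch_t ex) {
    return src_reg != 0 && ex.is_load && ex.dest == src_reg;
}
```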
It is your job to flesh out all of the stall and forwarding rules.

4.4 Branch Prediction
• Always predict branch-not-taken, and clear IF and ID/RR when a branch is taken.
• Address control hazards by predicting branch outcomes, with a default prediction that the branch is not taken.
• Implement hardware mechanisms to handle both correct predictions (continue normally) and incorrect predictions (flush incorrectly fetched instructions).

4.5 Flushing the Pipeline
• Develop a flushing mechanism for when branch instructions in the EX stage render previously fetched instructions incorrect.
• Avoid the asynchronous clear feature of registers to prevent timing issues; a multiplexer-based approach that selectively sends NOOP instructions could be helpful.

5 Report

Alongside the project, you will be required to submit a written report, roughly 2-3 pages in length. The report should be presentable, with appropriate formatting. It should include:
• An explanation of how to load your ROM(s) with your microcode.
• An explanation of the pipeline implementation (in particular the design of each stage and the data forwarding mechanism).
• Challenges that were faced during development.
• Results relating to cycle count when running the pow.s file, and any associated pipelining metrics that were taught in class.
• Potential areas of improvement or further optimization.

Submissions without a report will result in no extra credit points.

6 Testing

Be careful to only use the instructions listed in the appendix – there are some subtle points in having a separate instruction and data memory. Load the assembled program into both the instruction memory and the data memory and let your processor execute it. Any writes to memory will only affect the data memory.
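As a concrete illustration of the "pass a union of all requirements through the buffers" advice from Section 4.1, an EX/MEM latch might carry something like the following (field names are illustrative, not required by the project):

```c
#include <stdint.h>

/* One possible EX/MEM pipeline buffer: the union of everything any
 * instruction could need in the MEM and WB stages. */
typedef struct {
    uint32_t alu_result;  /* ALU output: sum, shift result, or address */
    uint32_t store_data;  /* SR value that SW writes to memory         */
    uint8_t  dest_reg;    /* destination register number, if any       */
    uint8_t  reg_write;   /* control: WB should write dest_reg         */
    uint8_t  mem_read;    /* control: MEM performs a load (LW)         */
    uint8_t  mem_write;   /* control: MEM performs a store (SW)        */
} exmem_latch_t;
```

An ADD only needs alu_result, dest_reg, and reg_write; a SW only needs alu_result, store_data, and mem_write; the buffer carries all fields so either can pass through.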
7 Grading

8 Deliverables

To submit your project, you need to upload the following files to Gradescope:
• LC-3300-pipe.sim
• Microcode file (microcode.xlsx) if applicable
• Report file as a PDF

Always re-download your assignment from Gradescope after submitting to ensure that all necessary files were properly uploaded. If what we download does not work, you will not get credit regardless of what is on your machine.

9 Appendix A: LC-3300-pipe Instruction Set Architecture

The LC-3300-pipe is a simple, yet capable computer architecture. The LC-3300-pipe combines attributes of both ARM and the LC-2200 ISA defined in the Ramachandran & Leahy textbook for CS 2200. The LC-3300-pipe is a word-addressable, 32-bit computer. All addresses refer to words, i.e. the first word (four bytes) in memory occupies address 0x0, the second word, 0x1, etc. All memory addresses are truncated to 16 bits on access, discarding the 16 most significant bits if the address was stored in a 32-bit register. This provides roughly 64 KB of addressable memory.

9.1 Registers

The LC-3300-pipe has 16 general-purpose registers. While there are no hardware-enforced restraints on the uses of these registers, your code is expected to follow the conventions outlined below.

Table 1: Registers and their Uses

Number  Name   Use                        Callee Save?
0       $zero  Always Zero                NA
1       $at    Assembler/Target Address   NA
2       $v0    Return Value               No
3       $a0    Argument 1                 No
4       $a1    Argument 2                 No
5       $a2    Argument 3                 No
6       $t0    Temporary Variable         No
7       $t1    Temporary Variable         No
8       $t2    Temporary Variable         No
9       $s0    Saved Register             Yes
10      $s1    Saved Register             Yes
11      $s2    Saved Register             Yes
12      $k0    Reserved for OS and Traps  NA
13      $sp    Stack Pointer              No
14      $fp    Frame Pointer              Yes
15      $ra    Return Address             No

1. Register 0 is always read as zero. Any values written to it are discarded. Note: for the purposes of this project, you must implement the zero register. Regardless of what is written to this register, it should always output zero. 3.
Register 2 is where you should store any returned value from a subroutine call. 4. Registers 3 – 5 are used to store function/subroutine arguments. Note: registers 2 through 8 should be placed on the stack if the caller wants to retain those values. These registers are fair game for the callee (subroutine) to trash. 5. Registers 6 – 8 are designated for temporary variables. The caller must save these registers if they want these values to be retained. 7. Register 12 is reserved for handling interrupts. While it should be implemented, it otherwise will not have any special use on this assignment. 8. Register 13 is the ever-changing top of the stack; it keeps track of the top of the activation record for a subroutine. 9. Register 14 is the anchor point of the activation frame. It is used to point to the first address on the activation record for the currently executing process. 10. Register 15 is used to store the address a subroutine should return to when it is finished executing.

9.2 Instruction Overview

The LC-3300-pipe supports a variety of instruction forms, only a few of which we will use for this project. The instructions we will implement in this project are summarized below.

Table 2: LC-3300-pipe Instruction Set

Opcode [31:28]  Remaining fields [27:0]   Instruction
0000            DR, SR1, unused, SR2      ADD
0001            DR, SR1, unused, SR2      NAND
0010            DR, SR1, immval20         ADDI
0011            DR, BaseR, offset20       LW
0100            SR, BaseR, offset20       SW
0101            SR1, SR2, offset20        BEQ
0110            AT, RA, unused            JALR
0111            unused                    HALT
1000            SR1, SR2, offset20        BGT
1001            DR, unused, PCoffset20    LEA
1010            DR, SR1, unused, 00, SR2  SLL
1010            DR, SR1, unused, 01, SR2  SRL
1010            DR, SR1, unused, 10, SR2  ROL
1010            DR, SR1, unused, 11, SR2  ROR

9.2.1 Conditional Branching

Branching in the LC-3300-pipe ISA is a bit different than usual. We have a set of branching instructions including BEQ, BGT, and FABS which offer the ability to branch upon a certain condition being met.
These instructions use comparison operators, comparing the values of two source registers. If the comparison is true (for example, with the BGT instruction, if SR1 > SR2), then we will branch to the target destination of incrementedPC + offset20. For FABS, if SR < 0 then we will branch to the series of microstates for negation.

9.3 Detailed Instruction Reference

9.3.1 ADD
Assembler Syntax: ADD DR, SR1, SR2
Encoding: 0000 | DR | SR1 | unused | SR2
Operation: DR = SR1 + SR2;
Description: The ADD instruction obtains the first source operand from the SR1 register. The second source operand is obtained from the SR2 register. The second operand is added to the first source operand, and the result is stored in DR.

9.3.2 NAND
Assembler Syntax: NAND DR, SR1, SR2
Encoding: 0001 | DR | SR1 | unused | SR2
Operation: DR = ~(SR1 & SR2);
Description: The NAND instruction performs a logical NAND (NOT AND) on the source operands obtained from SR1 and SR2. The result is stored in DR. HINT: A logical NOT can be achieved by performing a NAND with both source operands the same. For instance, NAND DR, SR1, SR1 achieves the logical operation DR ← NOT(SR1).

9.3.3 ADDI
Assembler Syntax: ADDI DR, SR1, immval20
Encoding: 0010 | DR | SR1 | immval20
Operation: DR = SR1 + SEXT(immval20);
Description: The ADDI instruction obtains the first source operand from the SR1 register. The second source operand is obtained by sign-extending the immval20 field to 32 bits. The resulting operand is added to the first source operand, and the result is stored in DR.
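The SEXT operation used by ADDI (and by the memory and branch instructions) can be sketched in C as follows (a reference sketch, not a required component):

```c
#include <stdint.h>

/* Sign-extend the low 20 bits of an instruction field to 32 bits. */
int32_t sext20(uint32_t field) {
    field &= 0xFFFFF;        /* keep bits [19:0]                    */
    if (field & 0x80000)     /* bit 19 is the sign bit of the field */
        field |= 0xFFF00000; /* replicate it through bits [31:20]   */
    return (int32_t)field;
}
```

In hardware this is just wiring: bit 19 of the field fans out to bits [31:20] of the 32-bit operand.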
9.3.4 LW
Assembler Syntax: LW DR, offset20(BaseR)
Encoding: 0011 | DR | BaseR | offset20
Operation: DR = MEM[BaseR + SEXT(offset20)];
Description: An address is computed by sign-extending bits [19:0] to 32 bits and then adding this result to the contents of the register specified by bits [23:20]. The 32-bit word at this address is loaded into DR.

9.3.5 SW
Assembler Syntax: SW SR, offset20(BaseR)
Encoding: 0100 | SR | BaseR | offset20
Operation: MEM[BaseR + SEXT(offset20)] = SR;
Description: An address is computed by sign-extending bits [19:0] to 32 bits and then adding this result to the contents of the register specified by bits [23:20]. The 32-bit word obtained from register SR is then stored at this address.

9.3.6 BEQ
Assembler Syntax: BEQ SR1, SR2, offset20
Encoding: 0101 | SR1 | SR2 | offset20
Operation: if (SR1 == SR2) { PC = incrementedPC + offset20 }
Description: A branch is taken if SR1 is equal to SR2. If this is the case, the PC will be set to the sum of the incremented PC (since we have already undergone fetch) and the sign-extended offset[19:0].

9.3.7 JALR
Assembler Syntax: JALR AT, RA
Encoding: 0110 | AT | RA | unused
Operation: RA = PC; PC = AT;
Description: First, the incremented PC (address of the instruction + 1) is stored into register RA. Next, the PC is loaded with the value of register AT, and the computer resumes execution at the new PC.

9.3.8 HALT
Assembler Syntax: HALT
Encoding: 0111 | unused
Description: The machine is brought to a halt and executes no further instructions.
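The branch-target arithmetic used by BEQ (and by BGT below) is worth checking with a small sketch; remember the base is the incremented PC, not the PC of the branch itself:

```c
#include <stdint.h>

/* Branch target: incrementedPC + SEXT(offset20). Unsigned wraparound
 * addition handles negative (backward) offsets correctly. */
uint32_t branch_target(uint32_t pc, uint32_t offset20) {
    uint32_t off = offset20 & 0xFFFFF;
    if (off & 0x80000)          /* negative 20-bit offset */
        off |= 0xFFF00000;
    return pc + 1 + off;        /* +1: fetch already advanced the PC */
}
```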
9.3.9 BGT
Assembler Syntax: BGT SR1, SR2, offset20
Encoding: 1000 | SR1 | SR2 | offset20
Operation: if (SR1 > SR2) { PC = incrementedPC + offset20 }
Description: A branch is taken if SR1 is greater than SR2. If this is the case, the PC will be set to the sum of the incremented PC (since we have already undergone fetch) and the sign-extended offset[19:0].

9.3.10 LEA
Assembler Syntax: LEA DR, label
Encoding: 1001 | DR | unused | PCoffset20
Operation: DR = PC + SEXT(PCoffset20);
Description: An address is computed by sign-extending bits [19:0] to 32 bits and adding this result to the incremented PC (address of instruction + 1). It then stores the computed address into register DR.

9.3.11 SLL
Assembler Syntax: SLL DR, SR1, SR2
Encoding: 1010 | DR | SR1 | unused | 00 | SR2
Operation: DR = SR1 << SR2;
Description: The value stored in SR1 is logically left shifted by the value stored in SR2, and the result is stored in DR.

9.3.12 SRL
Assembler Syntax: SRL DR, SR1, SR2
Encoding: 1010 | DR | SR1 | unused | 01 | SR2
Operation: DR = SR1 >> SR2;
Description: The value stored in SR1 is logically right shifted by the value stored in SR2, and the result is stored in DR.

9.3.13 ROL
Assembler Syntax: ROL DR, SR1, SR2
Encoding: 1010 | DR | SR1 | unused | 10 | SR2
Operation: DR = (SR1 << SR2) | (SR1 >> (32 - SR2));
Description: Bits in SR1 are "rotated" left by SR2 number of bits using circular shifting. During a left rotation, the bits that are shifted out from the left are brought back in on the right side.

9.3.14 ROR
Assembler Syntax: ROR DR, SR1, SR2
Encoding: 1010 | DR | SR1 | unused | 11 | SR2
Operation: DR = (SR1 >> SR2) | (SR1 << (32 - SR2));
Description: Bits in SR1 are "rotated" right by SR2 number of bits using circular shifting. During a right rotation, the bits that are shifted out on the right are brought back in on the left side.
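The rotate instructions compose two shifts, as in the ROL/ROR operations. A C sketch of rotate-left, with a guard for the shift-by-32 case (which is undefined in C, though the hardware formula handles it naturally):

```c
#include <stdint.h>

/* Rotate-left built from the two shifts in the ROL operation. */
uint32_t rol32(uint32_t x, uint32_t n) {
    n &= 31;                       /* keep the shift amount in [0, 31] */
    if (n == 0) return x;          /* avoid the undefined 32-bit shift */
    return (x << n) | (x >> (32 - n));
}
```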


[SOLVED] Cs2200 project 5-networking

1 Introduction

This project will expose you to each layer in the network stack and its respective purpose. In particular, you will "upgrade" the transport layer of a simulated network to make it more reliable. As part of completing this project, you will:
• Further explore the use of threads in an operating system, especially in the network implementation.
• Demonstrate how messages are segmented into packets and how they are reassembled.
• Understand why a checksum is needed and when it is used (including implementing your own).
• Understand and implement the stop-and-wait protocol with Positive (ACK) Acknowledgements, Negative (NACK) Acknowledgements, and retransmissions.
• Create your own new protocol called Reliable Transport Protocol (RTP).

For a description of the Stop-and-Wait Protocol, read Section 13.6.1 in your textbook. Our system consists of a client and a server. The client will send encrypted messages to the server. The server will then decrypt the messages and reply with the decrypted message. Then, the client will print out the full, decrypted message that it received from the server. You can see a list of those decrypted messages in the client.c file we provide you.

2 Requirements

As you work through this project, you will be completing various portions of code in C. There are two files you will need to modify:
• rtp.c: the main RTP protocol implementation, including your packetize() function, checksum() function, and send and receive threads
• rtp.h: to add any necessary fields to the rtp connection t struct

As you should strive for in any programming assignment, we expect quality code. In particular, your code must meet the following requirements:
• The code must not generate any compiler warnings.
• The code must not print extraneous output by default (i.e. any debug printfs must be disabled by default).
• The code must be reasonably robust and free of memory leaks.
• The code must protect resources shared among concurrent threads.
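The last requirement, protecting shared resources, usually comes down to the classic mutex/condition-variable pattern. A minimal sketch (the flag and function names are illustrative, not part of the provided API):

```c
#include <pthread.h>

/* Sketch: one thread reports an event; another thread waits for it. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ack_arrived = 0;

void note_ack(void) {                 /* producer side */
    pthread_mutex_lock(&lock);
    ack_arrived = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

void wait_for_ack(void) {             /* consumer side */
    pthread_mutex_lock(&lock);
    while (!ack_arrived)              /* while, not if: spurious wakeups */
        pthread_cond_wait(&cond, &lock);
    ack_arrived = 0;                  /* consume the event */
    pthread_mutex_unlock(&lock);
}
```

Every read or write of the shared flag happens while holding the mutex, and the waiter re-checks the predicate in a loop after each wakeup.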
3 The Protocol Stack

We have provided you with code that implements the network protocol stack:

Application Layer - client.c
Transport Layer - rtp.c
Network Layer - network.o
Data Link Layer
Physical Layer

Figure 1: The Protocol Stack

• For the purpose of this project, the data link layer and the physical layer are both implemented by the operating system and the underlying network hardware.
• We have implemented our own network layer and provided it to you through the files network.h and network.c. You will have to use the provided functions from these files to access the network layer. (Hint: when we are ready to send a packet, what function must we call?)
• The transport layer uses the services of the network layer to provide a specialized protocol to the application. The transport layer typically provides TCP or UDP services to the application using the IP services provided by the network layer. For this project, you will be writing your own transport layer. This includes implementing the various transport layer services for packetizing and queuing a buffer of arbitrary length.
• The application layer represents the end user application. The application simply makes the appropriate API calls to connect to remote hosts, send and receive messages, and disconnect from remote hosts. Messages are encrypted with ROT13 before sending and decrypted upon receipt.

4 Code Walkthrough

The client program takes two arguments. The first argument is the server it should connect to (such as localhost), and the second argument is the port it should connect to (such as 4000). Thus, the client can be run as follows.

$ ./rtp-client localhost 4000

The specific server and port you run on is up to you. We suggest standardizing these arguments in your testing. However, the client and server will need to communicate via the same port, so whichever argument you use to run the server will also need to be used on the client. More debugging tips can be seen in the appendix.
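ROT13 itself is tiny: each letter moves 13 places, so applying it twice restores the original. A sketch (not the provided implementation):

```c
#include <ctype.h>

/* Apply ROT13 in place: rotate letters 13 positions, leave the rest. */
void rot13(char *s) {
    for (; *s; s++) {
        if (isupper((unsigned char)*s))
            *s = 'A' + (*s - 'A' + 13) % 26;
        else if (islower((unsigned char)*s))
            *s = 'a' + (*s - 'a' + 13) % 26;
    }
}
```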
4.1 High-level Logic

The client.c program represents the application layer. It uses the services provided by the transport layer (rtp.c). It begins by connecting to the remote host. Look at the rtp connect function in rtp.c (note the rtp connection struct it works with). It simply uses the services provided by the network layer to connect to the remote host. Next, the rtp connect function initializes its rtp connection structure, initializes its send and receive queues, initializes its mutexes, starts its threads, and returns the rtp connection structure.

Next, the client program sends a message encoded using ROT13 cryptography (the letters of the alphabet are offset 13 places) to the remote host using rtp send message(). The rtp send message() function makes a copy of the information to send, places the message into a send queue, and returns so that the application can continue to do other things. A separate thread, the rtp send thread, actually sends the data across the network. It waits for a message to be placed into the send queue, then extracts that message from the queue and sends it.

Next, the client program receives a decrypted message from the network. What happens if a message isn't available or the entire message has not yet been received? The rtp receive message() function blocks until a message can be pulled from the receive queue. The rtp recv thread actually receives packets from the network and reassembles the packets into messages. Once it receives a message, it places the message into the receive queue so that rtp receive message() can extract it and return it to the application layer.

The client program continues to send and receive messages until it is finished. Last, the client program calls rtp disconnect() to terminate the connection with the remote host. This function changes the state of the connection so that other threads will know that this connection is dead.
The rtp disconnect() function then calls net disconnect(), signals the other threads, waits for the threads to finish, empties the queues, frees allocated space, and returns. 4.2 Packets and Types For the purposes of this project, there are five packet types: • DATA – a data packet that contains part of a message in its payload. • LAST DATA – just like a data packet, but also signifies that it is the last packet in a message. • ACK – acknowledges the receipt of the last packet • NACK – a negative acknowledgement stating that the last packet received was corrupted. • TERM – tells the server to shut down (you don’t need to worry about this one as it’s only used in the provided code). The packet format is defined in network.h. Each packet has a payload, which can be up to MAX PAYLOAD LENGTH bytes, a payload length indicator, type field, and a checksum. 5 Part I: Segmentation of Data When data is sent over a network, the data is chopped up into one or more parts and sent inside packets. A packet contains information that describes the message such as the destination of the data, the source of the data, and the data itself! The data being sent over the network is referred to as the ’payload’. Look in network.h; what other fields does our network packet carry? Think about why each field is needed. How much payload data can we fit into each packet? (Note: as with many things in this project, the packet data structure is simplified). (Part A) Open rtp.c and find the packetize function. Complete this function. Its purpose is to turn a message into an array of packets. It should: 1. Allocate an array of packets big enough to carry all of the data. (Don’t be afraid to take advantage of the heap. The provided code in the send thread function will free this array once it is done). 2. Populate all the fields of the packet including the payload. (a) Remember, the last packet should be a LAST DATA packet. All other packets should be DATA packets. THIS IS IMPORTANT. 
The server checks for this, and it will disconnect you if they are not filled in correctly. If you neglect the LAST DATA packet, your program will hang forever waiting for a response from the server, because it is waiting on you forever to send a terminating packet. (b) What is the length of the payload? This might depend on whether we are looking at the last packet or not. (c) You can simply call the checksum() function for now (you will implement this as well).

3. The count variable points to an integer. Update this integer, setting it equal to the length of the array you are returning.
4. Return the array of packets.

Hint: Remember that this is integer division. If length % MAX_PAYLOAD_LENGTH == 0, this is a special case that should be handled.

There are several other parts of the source code that say FIX ME. The code to be inserted in these parts of the program will simply provide additional functionality, but it is not necessary at this time. We will return to these parts of the code in Part II.

6 Part II: Computing the Checksum

In the stop-and-wait protocol, the sending thread does the following things:
1. Sends one packet at a time.
2. After each packet, waits for an ACK or a NACK to be received.
3. If a NACK is received, resends the last packet. Otherwise, sends the next packet.

Meanwhile, how does the receiving thread know whether to send an ACK or a NACK back to the server? It utilizes a concept called the checksum, which determines whether the packet sent was corrupted by performing a particular weighted algorithm on its characters. Thus the receiving thread will:
1. Compute the checksum for each packet payload upon arrival.
2. If the checksum does not match the checksum reported in the packet header, send a NACK. If it does match, send an ACK.

This way we can ensure not to send corrupted data up to the application layer.

(Part A) Open rtp.c and find the checksum function. In this project the checksum is computed as follows. 1.
First, shuffle the characters in the string such that the second half of the string is interleaved with the first half. For example, if our string is of the form a1a2a3b1b2b3 we should manually reorder them such that we produce a1b1a2b2a3b3 2. Following this, find the ASCII values of all of the characters. The checksum will be equal to the sum of the first two characters multiplied by 2, plus the sum of the next two characters divided by 2, and so on. 3. Return the final sum calculated. This is how the server computes the checksum, thus the client must do the same. As an example, suppose we are given the string ”abcdef”. When we reorder it, we obtain ”adbecf”. Then we have (a ∗ 2) + (d ∗ 2) + (b/2) + (e/2) + (c ∗ 2) + (f ∗ 2) Note that if the string is of odd length, take the floor of the middle for the start of b1,b2,…. Thus a string in the form ”abcde” becomes ”acbde” since c is the character at index 5/2. This is important. If your implementation does not account for odd length strings, you will not pass the autograder. 7 Part III: Receive and Send Implementations Now that our checksum is computed, we can take a hold of the sending and receiving threads and properly communicate between the server and the client. Depending on the values returned by checksum, we will return either a Negative Acknowledgement (NACK) or Positive Acknowledgment (ACK). (Part A) Open rtp.c and find the rtp recv thread function. Find the line that says FIX ME: Part III-A. Complete the following steps. 1. If the packet is a DATA packet, the payload is added to the current buffer. Note that we should only do this if the checksum of the data matches the checksum in the packet header. (a) Make sure to send the relevant packets to the server so it knows whether to continue sending you packets or resend the last packet. (b) Note that if this is the last packet in the sequence, and the packet was corrupted, we don’t want the loop to terminate. 2. 
Next, implement the code that will signal the sending thread that a NACK or ACK has been received. This means producing a mechanism for the sending thread to see which type of acknowledgement it was (hint: it’s okay to add fields to the rtp connection t data structure). Make sure you protect against concurrent modifications where necessary. (Part B) Open rtp.c and find the rtp recv thread function again. Find the line that says FIX ME: Part III-B. At this point in the function, an entire message has been received. 2. Then, add that message variable to the rtp client’s queue using the provided queue add function. Do this in a thread-safe manner, and signal the appropriate condition variable to let the client know a full message has been received (hint: are we accessing the sending queue or receiving queue?). (Part C) Open rtp.c and find the rtp send thread function. Find the line that says FIX ME: Part III-C. At this point, you should wait to be signaled by the receiving thread that a NACK or ACK has been received. 1. If notified that it was an ACK, continue as normal. 2. If notified that it was a NACK, resend the last packet. You should NOT call net receive packet in the send thread. The receiving thread is responsible for receiving packets. 8 Running the Project Note that running and debugging the project outside of docker is possible. First, you will need to make sure that python 3 is installed on your system to run the server. If you do not have python installed, you can install it with the following commands: $ sudo apt-get update $ sudo apt-get install python3.6 Next, to compile all of the code you wrote, use the following command: $ make Recall also that we can use $ make clean to empty out our executable if necessary. Our project now consists of a Python file we provide you (rtp-server.py) which will run the server, and the executable you just made (rtp-client). We will run these in tandem to start running the project. First, run the server. 
Open a command prompt and run

$ python rtp-server.py -p [port number] -c [corruption_rate]

with your own desired arguments for port number and corruption rate. For example, you might use 8080 for the port number, and 0.80 for the corruption rate (this has to be a value between 0 and 1). Then you would say

$ python rtp-server.py -p 8080 -c 0.8

Then, run the client:

$ ./rtp-client [host] [port number]

With the example from above, we might run

$ ./rtp-client 127.0.0.1 8080

You'll want to start an instance of the server first, then run the client. If you see "OSError: [Errno 98] Address already in use", this generally occurs when a server is already running in the background. There are many methods to solve this, but we'd recommend using the following commands:

$ ps -ef | grep -v 'grep' | grep 'python3.6 rtp-server.py'
burdell 7302 7261 0 4:20 pts/1 00:00:00 python3.6 rtp-server.py -p 8080 -c 0
$ sudo kill -9 7302

Note: The first number is the process ID, or PID, used in the kill command. Managing multiple terminal sessions can sometimes be tricky, so you might be interested in using tmux to manage the terminals running your client and the server if you have trouble.

The server will take the client's messages and decode them using ROT13. The server will print out debug statements in order for you to understand what it is doing.

9 Debugging Tips

2. Memcpy is a valid library function for the payload.
3. Make sure you set the fields for the last packet in packetize correctly (in particular, its size).
4. What kind of field in rtp.h could help us know what kind of acknowledgement we have? It might be helpful not to overcomplicate this.
5. Don't be afraid to use the heap in other functions outside packetize as well.
6. In the sending thread, make sure you implement threading correctly (do we want an if or a while loop?).

10 Deliverables
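As a final sanity check for the Part II algorithm described earlier, here is one reading of the checksum spec in C. Assumptions: the interleave splits at floor(length/2), and the integer division is applied per character, matching the worked example ((b/2) + (e/2)). This is a sketch for checking your understanding, not the required rtp.c signature.

```c
#include <stdlib.h>

/* Part II checksum sketch: interleave the two halves of the payload, then
 * weight pairs of characters alternately by *2 and /2 (per character,
 * integer division). */
int checksum_sketch(const char *message, int length) {
    char *shuffled = malloc(length);
    int mid = length / 2;             /* second half starts at floor(len/2) */
    int i, j = 0, sum = 0;
    for (i = 0; i < mid; i++) {       /* a1 b1 a2 b2 ... */
        shuffled[j++] = message[i];
        shuffled[j++] = message[mid + i];
    }
    if (length % 2)                   /* odd length: one leftover char */
        shuffled[j++] = message[length - 1];
    for (i = 0; i < length; i++) {
        if ((i / 2) % 2 == 0)         /* pairs 0, 2, ... weighted *2 */
            sum += shuffled[i] * 2;
        else                          /* pairs 1, 3, ... weighted /2 */
            sum += shuffled[i] / 2;
    }
    free(shuffled);
    return sum;
}
```

For "abcdef" this interleaves to "adbecf" and sums (a*2)+(d*2)+(b/2)+(e/2)+(c*2)+(f*2), matching the spec's worked example; for the odd-length "abcde" it produces "acbde" as required.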


[SOLVED] CS2200 project 3 – virtual memory

1 Introduction

Read the entire document before starting. There are critical pieces of information and hints along the way. In this project, you will be implementing part of a virtual memory system simulator. You have been given a simulator which is missing some critical parts. You will be responsible for implementing these parts. Detailed instructions are in the files to guide you along the way. If you are having trouble, we strongly suggest that you take the time to read about the material from the book and class notes. This project is divided into 10 problems. The files that you will be modifying are the following:
• va_splitting.h – Break down a virtual address into its components.
• mmu.c – Initialize any system- and memory-access-related bookkeeping.
• proc.c – Initialize any process-related bookkeeping.
• page_fault.c – Implement the page fault handler.
• page_replacement.c – Implement the FIFO and Clock Sweep frame eviction algorithms.
• stats.c – Calculate the Average Memory Access Time (AMAT).

It is a good idea to peek into the following files as well:
• mmu.h – Defines the structures used by mmu.c.
• proc.h – Defines the structures used by proc.c.
• pagesim.h – Defines simulation parameters as well as global structures.
• pagesim.c – Reads a trace file of memory operations and calls each operation’s corresponding function implemented in proc.c.
• stats.h – Defines parameters that can be used when calculating AMAT.
• swap.c – Initializes functions to support a queue that is used in swapops.c.
• swapops.c – Initializes functions to keep track of the frames swapped out of physical memory. Discussed in Section 8.
• types.h – Defines different types that are used throughout the simulation.
• util.c – Initializes a random function used for the random replacement algorithm.

2 Memory Organization Background

The simulator simulates a system with a byte-addressable, 20-bit physical address space.
Throughout the simulator, you can access physical memory through the global variable uint8_t mem[] (an array of bytes called “mem”). You have access to, and will manage, the entirety of physical memory. The system has a 24-bit virtual address space, and memory is divided into 16KB pages. Like a real computer, your page tables and paging data structures live in physical memory too! Conveniently, both a page table and the frame table fit in a single page frame in memory, and so you’ll want to dedicate some page frames for storing this data. (Figure 1: Organization of physical memory showing frames and frame entries.) You are responsible for placing and initializing these structures in memory. Note: Since user data page frames and operating system page frames such as the frame table and page tables coexist in the same physical memory, we must have some way to differentiate between the two, and keep user pages from replacing system pages. For this project, we will take a simple approach: every page frame has a “protected” bit in its frame table entry, which is set to “1” for system frames and “0” for user frames. In other words, we’ll set the protected bit to 1 for the frames holding paging-system metadata and 0 for the page frames holding user data pages.

3 Address Splitting

In most modern operating systems, user programs access memory using virtual addresses. The hardware and the operating system work together to turn the virtual address into a physical address, which can then be used to index into physical memory. The first step of this process is to split the virtual address into two parts: the higher-order bits for the VPN, and the lower bits for the page offset. Implement the vaddr_vpn and vaddr_offset functions in va_splitting.h. These will be used to split a virtual address into its corresponding page number and page offset.
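With a 24-bit virtual address space and 16KB pages, the offset occupies the low 14 bits and the VPN the remaining 10 bits. A minimal sketch of the two functions follows; the real constants should be derived from PAGE_SIZE and friends in pagesim.h, and are hard-coded here only for illustration:

```c
#include <stdint.h>

/* Assumed parameters: 24-bit virtual addresses, 16KB (2^14-byte) pages.
 * In the project these would come from pagesim.h, not literals. */
#define OFFSET_LEN 14

typedef uint32_t vaddr_t;

/* Virtual page number: the high-order bits above the offset. */
uint32_t vaddr_vpn(vaddr_t addr) {
    return addr >> OFFSET_LEN;
}

/* Page offset: the low OFFSET_LEN bits. */
uint32_t vaddr_offset(vaddr_t addr) {
    return addr & ((1u << OFFSET_LEN) - 1);
}
```

For example, the 24-bit address 0xd84f78 splits into VPN 0x361 and offset 0xf78, and shifting the VPN back up and adding the offset reconstructs the original address.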
You will need to use the parameters for the virtual memory system defined in pagesim.h (PAGE_SIZE, MEM_SIZE, etc.). (Figure 2: Virtual addresses containing VPN and offset.)

4 Initialization

When the simulation starts, it will need to first set up the frame table (sometimes known as a “reverse lookup table”). Implement the system_init function in mmu.c. For simplicity, we always place the frame table in physical frame 0. You need to initialize the frame_table pointer to the start of this frame. Don’t forget to mark the first entry of the frame table (which corresponds to the frame holding the frame table itself) as “protected”, because we will never evict the frame table. Then, during your page replacement, you will need to make sure that you never choose a protected frame as your victim. After setting up the frame table, we will need to set up a page table at the start of each process. Implement the proc_init() function in proc.c. Since processes can start and stop at any time during your computer’s lifetime, we must be a little more sophisticated in choosing which frames hold their page tables. For now, we won’t worry about the logistics of choosing a frame: just call the free_frame function you’ll write later in page_replacement.c. Then set the appropriate flags for that frame table entry. (HINT: Do we ever want to evict the frame containing the page table while the process is running?) After picking the frame for a process’s page table, remember to update the ptbr in the process’s pcb struct. Each frame contains PAGE_SIZE bytes of data; therefore, to access the start of the i-th frame in memory, you can use mem + (i * PAGE_SIZE).

5 Context Switches and the Page Table Base Register

As you know, every process has its own page table. When the processor needs to perform a page lookup, it must know which page table to look in. This is where the page table base register (PTBR) comes in.
In the simulator, you can access the page table base register through the global variable pfn_t PTBR. Implement the context_switch function in proc.c. Your job is to update the PTBR to refer to the new process’s page table. This function will be very simple. Going forward, pay close attention to the type of the PTBR. The PTBR holds a physical frame number (PFN), not a virtual address. Think about why this must be.

6 Reading and Writing Memory

The ability to allocate physical frames is useless if we cannot read or write to them. In this section, you will add functionality to the simulator to allow it to make read and write memory references on behalf of the simulated processes. The simulator will use pre-recorded lists of memory references captured (traced) from the execution of real processes to simulate the memory references in each process. Because processes operate on a virtual memory space, it is necessary to first translate a virtual address supplied by a process into its corresponding physical address, which then will be used to access the location in physical memory. This is accomplished using the page table, which contains all of a process’s mappings from virtual addresses to physical addresses. As previously mentioned in Section 3, when running a user process, all addresses from the CPU are virtual and must be translated. Do note that for this project, we assume that when the operating system is running (i.e., the CPU is in system mode), address translation is disabled and all memory addresses referenced by the CPU will be treated as physical addresses. This is why the operating system itself has no page table! Implement the mem_access function in mmu.c. You will need to use the passed-in virtual address to find the correct page table entry and the offset within the corresponding page. HINT: Use the virtual address splitting functions that you wrote earlier in the project.
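The translation inside mem_access boils down to: split the virtual address, index the current page table, and combine the PFN with the offset. A rough sketch under the same assumed constants as above; the struct and field names here are illustrative, not the simulator’s actual definitions from mmu.h:

```c
#include <stdint.h>

#define OFFSET_LEN 14
#define PAGE_SIZE  (1u << OFFSET_LEN)   /* assumed 16KB pages */

/* Illustrative page table entry; the real layout lives in mmu.h. */
typedef struct {
    uint8_t  valid;
    uint32_t pfn;
} pte_t;

/* Translate a virtual address given a process's page table.
 * Returns the physical address (an index into mem[]); a real
 * mem_access would first call page_fault() when pte->valid is 0. */
uint32_t translate(const pte_t *page_table, uint32_t vaddr) {
    uint32_t vpn    = vaddr >> OFFSET_LEN;
    uint32_t offset = vaddr & (PAGE_SIZE - 1);
    const pte_t *pte = &page_table[vpn];
    return pte->pfn * PAGE_SIZE + offset;
}
```

Note that the page table pointer itself would be found through the PTBR, which stores a PFN: the table lives at mem + (PTBR * PAGE_SIZE).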
Once you have identified the correct page table entry, you must use this to find the corresponding physical frame and its physical address in memory, and then perform the read or write at the proper location within the page. (Remember that the simulator’s physical memory is represented by the mem array and its subscripts are the physical memory addresses.) Keep in mind that not all entries in a process’s page table have necessarily been mapped. Entries not yet mapped are marked as invalid, and an attempt to access an invalid address should generate a page fault. You will write the page_fault() function in the next section, so for now just assume that it has successfully allocated a page for that address after it returns. After we are sure the correct page is mapped in for the requested virtual address, we want to make sure the page is marked as referenced in the page table. This is used later by the page replacement algorithms. Make sure to mark the containing page as “dirty” in the process’s page table on a write. These bits will be used later when deciding which pages should be evicted first, and whether an evicted page needs to be written to disk to preserve its content.

7 Eviction and Replacement

Recall that when a CPU encounters an invalid VPN-to-PFN mapping in the page table, the OS allocates a new frame for the page by either finding an empty frame or evicting a page from a frame that is in use. In this section, you will be implementing a page fault and replacement mechanism. Implement the function page_fault() in page_fault.c. A page fault occurs when the CPU attempts to translate a virtual address, but finds no valid entry in the page table for the VPN. Our page fault handler will provide the CPU with a valid frame filled either with its previously evicted data or with zeros (meaning it was never written to before). To resolve the page fault, you must do the following:
1. Get the page table entry for the faulting virtual address.
2. Using free_frame(), get the PFN of a new frame.
3. Check to see if there is an existing swap entry for the faulting page table entry. More on swap in Section 8.
4. If a swap entry exists, use swap_read() to read the saved frame into the new frame; otherwise, clear the new frame.
5. Update the mapping from VPN to the new PFN in the current process’s page table.
6. Update the flags in the corresponding frame table entry.

Next, we will turn our attention to the eviction process in page_replacement.c. When we allocate frames for our pages, they needed to have come from somewhere. Throughout the whole project, you have been using free_frame() without implementing it. Now, it is finally time to implement free_frame(). Implement free_frame() in page_replacement.c. free_frame() will return the frame selected by select_victim_frame() (discussed in Section 10). If the victim frame is already mapped, you will effectively evict it. This means updating both the page and frame table entries with the appropriate flags (recall what mem_access() looked for to check for evicted pages). If the evicted frame is dirty, then you will need to swap_write() it into the swap space (discussed in Section 8) and clear the dirty bit.

8 Swap Space

If the evicted page is dirty, you will need to write its contents to disk. The area of the disk that stores the evicted pages is called the swap space. Swap space effectively extends the main memory (RAM) of your system. If physical memory is full, the operating system kicks some frames out to the hard disk to accommodate others. When the “swapped” frames are needed again, they are restored from the disk into physical memory. Without the swapping mechanism, when the system runs out of RAM and we start evicting physical frames, we lose the data stored in those frames, and the process whose pages were originally mapped to the evicted frames loses its data forever.
Therefore, upon selecting a victim, we need to make sure that its data is swapped out to disk and restored when needed. Recall that during your page fault handler, you must fill the newly provided frame with any data that was previously written to this page and evicted to the swap space. To do this, we have provided the method swap_exists() that checks whether the faulting page was swapped out to disk previously. If it was, then you need to restore it using swap_read(). If the faulting page has not been swapped previously, then you need to zero out the freed frame to prevent the current process from potentially reading the memory of some other process. To write the contents of a victim’s page to disk, we provide a method called swap_write(), where you pass in a pointer to the victim’s page table entry and a pointer to the frame in memory. This will simulate swapping the page to disk.

9 Finishing a Process

If a process finishes, we don’t want it to hold onto any of the frames that it was using. We should unmap any frames so that other processes can use them (remember not to unmap pages that are now being used by other processes). Also: if the process is no longer executing, can we release the page table? As part of cleaning up a process, you will also need to free any swap entries that have been mapped to pages. You can use swap_free() to accomplish this. Implement the function proc_cleanup() in proc.c.

10 Better Victim Selection

In Section 7, we relied on the select_victim_frame() function to tell us which frame to choose as our “victim”. We have provided you with a default, inefficient page replacement algorithm that randomly selects a page to be evicted. The simulator can run this replacement strategy out-of-the-box so that you can test the other parts of your code without having to write a page replacement algorithm. Run the simulator with -rrandom to use the random algorithm.

10.1 FIFO Algorithm

Of course, we can do better than random replacement.
Implement the first-in, first-out replacement algorithm. Your FIFO algorithm should choose the least recently mapped frame table entry based on the global last_evicted, which is the offset into the frame table of the last evicted frame. Loop around the frame table until you find the first unprotected frame and return it. Once you have implemented the FIFO algorithm, you will be able to run the simulator with the -rfifo argument to use it as your page replacement strategy. Remember again that a frame whose protected bit is set should never be chosen as a victim.

10.2 Clock Sweep Algorithm

After implementing the FIFO algorithm, implement the Clock Sweep algorithm. In the Clock Sweep algorithm, every page table entry has a reference bit which is set once the page has been accessed. When looking for a victim, the Clock Sweep algorithm will choose the first page that does not have its reference bit set to 1. If all of the page table entries have their reference bit set, then this degenerates to FIFO. Look at Section 8.3.5 in the textbook for more information. Remember that your sweep should start where the previous sweep ended; you can use the global last_evicted to keep track of this. Once you have implemented the Clock Sweep algorithm, you will be able to run the simulator with the -rclocksweep argument to use it as your page replacement strategy. Remember again that a page whose protected bit is set should never be chosen as a victim.

11 Computing AMAT

Once you write your stats function in this section, compare the performance of the three algorithms. What do you observe? In the final section of this project, you will be computing some statistics:
• accesses – The total number of accesses to the memory system
• page faults – Accesses that resulted in a page fault
• writebacks – How many times you wrote to disk
• AMAT – The average memory access time of the memory system

We will give you some numbers that are necessary to calculate the AMAT:
• MEMORY ACCESS TIME – The time taken to access memory (SET BY SIMULATOR)
• DISK PAGE READ TIME – The time taken to read a page from the disk (SET BY SIMULATOR)
• DISK PAGE WRITE TIME – The time taken to write a page to the disk (SET BY SIMULATOR)

You will need to implement the compute_stats() function in stats.c.

12 Simulator Process Diagram

Figure 3: This diagram gives a general overview of how the simulator works.

13 How to Run / Debug Your Code

13.1 Environment

13.2 Compiling and Running

We have provided a Makefile that will run gcc for you. To compile your code with no optimizations (which you should do while developing; it will make debugging easier) and test with the “random” algorithm, run: $ make $ ./vm-sim -i traces/.trace -rrandom Once your Clock Sweep algorithm has been implemented, you can run the program with the -rclocksweep argument in order to test. For example, you should run: $ make $ ./vm-sim -i traces/.trace -rclocksweep Once your FIFO algorithm has been implemented, you can run the program with the -rfifo argument in order to test. For example, you should run: $ make $ ./vm-sim -i traces/.trace -rfifo We highly recommend starting with “simple.trace”. This will allow you to test the core functionality of your virtual memory simulator without worrying about context switches or write-backs, as this trace contains neither.

13.3 Inputting Trace Lines

The simulator can be run with the -s flag, which allows you to input the commands in the trace files from the command line. Run: $ make $ ./vm-sim -s -rrandom The commands from the trace files come in 3 forms.
Start Process: START Stop Process: STOP Memory Access: A simple example that starts a process, saves and then reads some data, and ends the process would look like: $ make $ ./vm-sim -s -rrandom Input trace lines. START 3 0: PID 3 started 3 w d84f78 7 1: 3 w 0xd84f78 07 STOP 3 3: PID 3 stopped You will need to Control-C to quit the simulation.

13.4 Corruption Checker

One challenge of working with any memory-management system is that your system can easily corrupt its own data structures if it misbehaves! Such corruption issues can easily hide for many cycles, then manifest as seemingly unrelated crashes later. To help with detecting these issues, we’ve included a “corruption check” mode that aggressively verifies your data structures after every cycle. To use the corruption checker, run the simulator with the -c argument: $ ./vm-sim -c -i traces/.trace -r

13.5 Debugging Tips with GDB

If your program is crashing or misbehaving, you can use GDB to locate the bug. GDB is a command-line interface that will allow you to set breakpoints, step through your code, see variable values, and identify segfaults. There are tons of online guides; click here (http://condor.depaul.edu/glancast/373class/docs/gdb.html) for one.

13.5.1 Compiling with Debugging Information

To compile with debugging information, you must build the program with make debug: $ make clean $ make debug

13.5.2 Starting GDB

To start your program in gdb, run: $ gdb ./vm-sim

13.5.3 Setting Breakpoints

$ (gdb) break pagesim.c:53 ! set breakpoint at call to system_init $ (gdb) r -i traces/.trace -r ! (wait for breakpoint) $ (gdb) s ! step into the function call …or by using the actual function name being called from the main loop: $ (gdb) break sim_cmd ! set breakpoint at call to sim_cmd $ (gdb) r -i traces/.trace -r ! (wait for breakpoint) $ (gdb) s ! step into the function call

13.5.4 Stepping Through Code

Once execution pauses at a breakpoint, you can step through your code using the step command.
For example: $ (gdb) s ! step into the function call

13.5.5 Examining Memory

$ (gdb) x/24xb frame_table 0x1004000aa: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x1004000b2: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x1004000ba: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

13.5.6 Backtracing

If your program crashes, you can use backtracing to see the call stack at the point of the crash. For example: $ (gdb) backtrace

13.5.7 Using the Corruption Checker

If you use the corruption checker, you can set a breakpoint on panic() and use a backtrace to discover the context in which the panic occurred: $ (gdb) break panic $ (gdb) r -i traces/.trace -r ! (wait for GDB to stop at the breakpoint) $ (gdb) backtrace $ (gdb) frame N ! where N is the frame number you want to examine

13.6 Verifying Your Solution

On execution, the simulator will output data read/write values. To check against our solutions, run $ ./vm-sim -i traces/.trace -r > my_output.log $ diff my_output.log outputs/.log The second half of the output file name indicates the replacement algorithm that should be used when comparing the output. E.g., astar-random.log should be compared with the output from the random replacement algorithm (-rrandom), as shown below: $ ./vm-sim -i traces/astar-random.trace -rrandom > my_output.log $ diff my_output.log outputs/astar-random.log You MUST implement the Clock Sweep algorithm in order to test against all the *-clocksweep.log output files.

14 How to Submit

Run make submit to automatically package your project for submission. Submit the resulting tar.gz archive on Canvas. Always re-download your assignment from Canvas after submitting to ensure that all necessary files were properly uploaded. If what we download does not work, you will get a 0 regardless of what is on your machine. This project will be demoed. In order to receive full credit, you must sign up for a demo slot and complete the demo. We will announce when demo times are released.
15 Debugging Tips with GDB

The GDB workflow and commands for this project are covered in Section 13.5 above. Remember: always run your program through GDB before asking for help with a segfault. These are just a few basic GDB commands to get you started. Feel free to explore more commands and consult the GDB documentation for advanced debugging techniques.
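As a closing illustration for the replacement algorithms in Section 10, here is a hedged sketch of Clock Sweep victim selection. The frame table entry layout, field names, and table size below are assumptions for the sake of a self-contained example, not the simulator’s actual definitions:

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_FRAMES 8   /* tiny table for illustration only */

/* Hypothetical frame table entry; the real fields live in mmu.h. */
typedef struct {
    uint8_t protected;   /* never evict if set */
    uint8_t referenced;  /* set on access, cleared as the hand passes */
} fte_t;

static size_t last_evicted = 0;  /* where the previous sweep ended */

/* Clock Sweep: start after the last victim, skip protected frames,
 * clear reference bits as the hand passes, and take the first frame
 * whose reference bit is already clear. If every unprotected frame is
 * referenced, the second pass behaves like FIFO. Assumes the table
 * always contains at least one unprotected frame. */
size_t select_victim_clocksweep(fte_t *table) {
    for (;;) {
        last_evicted = (last_evicted + 1) % NUM_FRAMES;
        fte_t *fte = &table[last_evicted];
        if (fte->protected) continue;
        if (fte->referenced) { fte->referenced = 0; continue; }
        return last_evicted;
    }
}
```

Dropping the reference-bit handling from the loop yields the FIFO variant: advance from last_evicted and return the first unprotected frame.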


[SOLVED] CS2200 project 4 – process scheduling & threading

Project 4: Process Scheduling Simulation

1 Overview

In this project, you will implement a multiprocessor operating system simulator using a popular threading library for Linux called pthreads. The framework for the multiprocessor OS simulator is nearly complete, but missing one critical component: the process scheduler! Your task is to implement the process scheduler and three different scheduling algorithms. The simulated operating system supports only one thread per process, making it similar to the systems that we discussed in Chapter 6. However, the simulator itself will use a thread to represent each of the CPUs in the simulated hardware. This means that the CPUs in the simulator will appear to operate concurrently. We have provided you with source files that constitute the framework for your simulator. You will only need to modify answers.txt and student.c. However, it is in your best interest to read and understand the other files; it will make it much easier to implement the functions in student.c. We have provided you the following files:
1. os-sim.c – Code for the operating system simulator which calls your CPU scheduler.
2. os-sim.h – Header file for the simulator.
3. process.c – Descriptions of the simulated processes.
4. process.h – Header file for the process data.
5. student.c – This file contains stub functions for your CPU scheduler.
6. student.h – Header file for your code to interface with the OS simulator. Also contains the ready queue struct definition.
7. answers.txt – A text file in which you should write your answers to the questions listed throughout this PDF.

1.1 Scheduling Algorithms

For your simulator, you will implement the following three CPU scheduling algorithms:
1. First Come, First Serve (FCFS) – Runnable processes are kept in a ready queue. FCFS is non-preemptive; once a process begins running on a CPU, it will continue running until it either completes or blocks for I/O.
The process with the earliest arrival time in the ready queue will be the next process selected. It is highly recommended to read the textbook section on FCFS before starting.
2. Round-Robin – Each process runs for at most one timeslice; a process that is still running when its timeslice expires is preempted and moved to the tail of the ready queue. It is highly recommended to read the textbook section on Round Robin before starting.
3. Preemptive Priority Scheduling with Aging – Processes with higher priority get to run first, and processes with lower priority get preempted for a process with higher priority. There is a caveat, though: our priority scheduler factors in the age of a process when determining priority.

1.2 Process States

In our OS simulation, there are five possible states for a process. These states are listed in the process_state_t enum in os-sim.h:
1. NEW – The process is being created, and has not yet begun executing.
2. READY – The process is ready to execute, and is waiting to be scheduled on a CPU.
3. RUNNING – The process is currently executing on a CPU.
4. WAITING – The process has temporarily stopped executing, and is waiting for an I/O request to complete.
5. TERMINATED – The process has completed.

There is a field named state in the PCB, which must be updated with the current state of the process. The simulator will use this field to collect statistics. (Figure 1: Process States)

1.3 The Ready Queue

On most systems, there are a large number of processes that need to share the resources of a small number of CPUs. When there are more processes ready to execute than CPUs, processes must wait in the READY state until a CPU becomes available. To keep track of the processes waiting to execute, we keep a ready queue of the processes in the READY state.

1.4 Scheduling Processes

Note that in a multiprocessor environment, we cannot mandate that the currently running process be at the head of the ready queue. There is an array (one entry for each CPU) that will hold the pointer to the PCB currently running on that CPU.

1.5 CPU Scheduler Invocation

1.
yield() – A process completes its CPU operations and yields the processor to perform an I/O request. 2. wake_up() – A process that previously yielded completes its I/O request, and is ready to perform CPU operations. wake_up() is also called when a process in the NEW state becomes runnable. 3. preempt() – A running process is preempted before it finishes (e.g., by a timer interrupt or a higher-priority process) and must be returned to the ready queue. 4. terminate() – A process exits or is killed. 5. idle() – Waits for a new process to be added to the ready queue. idle() contains the code that gets executed by the idle process. In the real world, the idle process puts the processor in a low-power mode and waits. For our OS simulation, you will use a pthread condition variable to block the thread until a process enters the ready queue.

1.6 The Simulator

We will use pthreads to simulate an operating system on a multiprocessor computer. We will use one thread per CPU and one thread as a ‘supervisor’ for our simulation. The supervisor thread will spawn new processes (as if a user started a process). The CPU threads will simulate the currently-running processes on each CPU, and the supervisor thread will print output. Since the code you write will be called from multiple threads, the CPU scheduler you write must be thread-safe! This means that all data structures you use, including your ready queue, must be protected using mutexes. The number of CPUs is specified as a command-line parameter to the simulator. For this project, you will be performing experiments with 1, 2, and 4 CPU simulations. Also, for demonstration purposes, the simulator executes much slower than a real system would. In the real world, a CPU burst might range from one to a few hundred milliseconds, whereas in this simulator, they range from 0.2 to 2.0 seconds.

Figure 2: Simulator Function Calls

The above diagram should give you a good overview of how the system works in terms of the functions being called and PCBs moving around.
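The condition-variable pattern described for idle() might look like the sketch below. The variable and function names are illustrative; in the project, the queue and its mutex live in student.c and the enqueue side is your own enqueue() helper:

```c
#include <pthread.h>

/* Hypothetical ready-queue state standing in for the real queue. */
static pthread_mutex_t queue_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  queue_not_empty = PTHREAD_COND_INITIALIZER;
static int ready_count = 0;

/* Enqueue side: signal any idle CPU thread that work has arrived. */
void queue_push(void) {
    pthread_mutex_lock(&queue_mutex);
    ready_count++;
    pthread_cond_signal(&queue_not_empty);
    pthread_mutex_unlock(&queue_mutex);
}

/* idle() side: block until the ready queue is non-empty. The while
 * loop (not an if) re-checks the predicate after each wakeup, which
 * guards against spurious wakeups and lost races with other CPUs. */
void idle_wait(void) {
    pthread_mutex_lock(&queue_mutex);
    while (ready_count == 0)
        pthread_cond_wait(&queue_not_empty, &queue_mutex);
    pthread_mutex_unlock(&queue_mutex);
}
```

A real idle() would then dequeue a process and context-switch to it while still coordinating through the same mutex; this sketch only shows the waiting discipline.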
Below is a second diagram that shows the entire system overview; the code that you need to write is inside the green cloud at the bottom. All of the items outside of the green cloud are part of the simulator and will not need to be modified by you. (Figure 3: System overview)

Compile and run the simulator with ./os-sim 2. After a few seconds, hit Control-C to exit. You will see the output below (Figure 4: Sample Output). The simulator generates a Gantt chart, showing the current state of the OS at every 100ms interval. The leftmost column shows the current time, in seconds. The next three columns show the number of Running, Ready, and Waiting processes, respectively. The next two columns show the process currently running on each CPU. The rightmost column shows the processes which are currently in the I/O queue, with the head of the queue on the left and the tail of the queue on the right. As you can see, nothing is executing. This is because we have no CPU scheduler to select processes to execute! Once you complete Problem 1 and implement a basic FCFS scheduler, you will see the processes executing on the CPUs.

2 Problem 0: The Ready Queue

We have provided simple implementations of queue_t, enqueue(), dequeue(), and is_empty() in student.c. The struct you have to implement will serve as your ready queue, and you should be using these helper functions to add and remove processes from the ready queue in the problems to follow.

2.1 Provided Queue

• enqueue() will add a process to the ready queue.
• dequeue() will find the next process to remove according to the scheduling algorithm, remove it from the queue, and return a pointer to it.
• NOTE: When using the ready queue helper functions in the following problems, make sure to call them in a thread-safe manner. Read up on how to use mutex locks, and lock/unlock the mutex for the ready queue when you call these functions.
You might need to modify the enqueue() and dequeue() functions to support the different scheduling algorithms as you move forward in the project.

3 Problem 1: FCFS Scheduler

• Implement the yield(), wake_up(), and terminate() handlers in student.c. Check out the hints in Section 3.1, and note that preempt() is not necessary for this stage of the project.
• Implement idle(). idle() must wait on a condition variable that is signalled whenever a process is added to the ready queue.

3.1 Hints

• Be sure to update the state field of the PCB in all the methods above. The library will read this field to generate the RUNNING (Ru), READY (Re), and WAITING (Wa) columns, and to print the statistics at the end of the simulation.
• context_switch() takes a timeslice parameter, which is used for preemptive scheduling algorithms. Since FCFS is non-preemptive, use -1 for this parameter to give the process an infinite timeslice.
• Make sure to use the helper functions in a thread-safe manner when adding and removing processes from the ready queue!
• The current[] array should be used to keep track of the process currently executing on each CPU. Since this array is accessed by multiple CPU threads, it must be protected by a mutex. current_mutex has been provided for you.

4 Problem 2: Round-Robin Scheduler

Add Round-Robin scheduling functionality to your code. Alter the provided enqueue() and dequeue() to support round robin. You should modify main() to add a command-line option, -r, which selects the Round-Robin scheduling algorithm and accepts a parameter: the length of the timeslice. For this project, timeslices are measured in tenths of seconds. E.g., ./os-sim -r 5 should run a Round-Robin scheduler with timeslices of 500 ms, while ./os-sim should continue to run a FCFS scheduler. Note: you can use getopt(), which we used earlier in the semester, or just parse the command-line arguments passed into main using if statements.
Implement preempt(). To specify a timeslice when scheduling a process, use the timeslice parameter of context_switch(). The simulator will simulate a timer interrupt to preempt the process and call your preempt() handler if the process executes on the CPU for the length of the timeslice without terminating or yielding for I/O.

5 Problem 3: Preemptive Priority Scheduling with Aging

Add Priority with Aging scheduling to your code. Alter the provided enqueue() and/or dequeue() to support priority with aging. Modify main() to accept the -p parameter to select the Priority Scheduling with Aging algorithm. The command line argument will follow this format:

./os-sim -p <age_weight>

Take a look at your homework 4 if you are struggling with this. Implement the function priority_with_age(). Each process has a base priority; however, we need to factor in its age to obtain the process's functional priority. To do this, we must understand a few variables.

1. current_time is a running clock that tells us how long it has been since our simulator booted up. We can obtain this value by simply calling the function get_current_time().
2. enqueue_time is a value in the PCB that tells us when the process was put into the ready queue. We update this every time we enqueue a process.
3. age_weight is an argument passed in from the command line. This value determines how much priority a process gains per unit of age.

To calculate the functional priority, use the equation:

functional_priority = base_priority + (current_time - enqueue_time) * age_weight

When a process is awakened and all CPU cores are currently occupied by running processes, we preempt the lowest-priority running process if its priority is lower than that of the newly awakened process. Subsequently, we add the preempted process to the ready queue.
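The aging equation above translates directly into C. This is a hypothetical helper with illustrative field names, not the real PCB layout or the real priority_with_age() signature:

```c
#include <assert.h>
#include <math.h>

/* Hypothetical model of the aging computation; in the real code the
 * current time comes from get_current_time() and the fields live in
 * the simulator's PCB. */
typedef struct {
    double base_priority;
    double enqueue_time;   /* set every time the process is enqueued */
} pcb_model_t;

/* functional_priority = base_priority
 *                       + (current_time - enqueue_time) * age_weight */
double functional_priority(const pcb_model_t *p, double current_time,
                           double age_weight)
{
    return p->base_priority + (current_time - p->enqueue_time) * age_weight;
}
```

For example, at current_time 15 with age_weight 0.2, a process with base priority 2 that was enqueued at time 5 has functional priority 2 + (15 - 5) * 0.2 = 4.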
NOTE: Contrary to modern operating systems, a higher number means a higher priority.

Figure 5: Example Ready Queue

The above ready queue is an example of our priority-with-aging algorithm. Our example simulator is at current_time 15, has an age_weight of 0.2, and has four processes in its ready queue. Notice the following:

• Process 1 is earlier in the queue than process 2, yet process 2 has a higher functional priority.
• Process 4 has a higher base priority than process 3, yet process 3 has a higher functional priority.
• Processes 2 and 4 have the same base priority, yet process 2 has a much higher functional priority.

6 Problem 4: Short Answers

Please write your answers to the following questions in answers.txt.

6.1 FIFO Scheduler

Run your OS simulation with 1, 2, and 4 CPUs. Compare the total execution time of each. Is there a linear relationship between the number of CPUs and total execution time? Why or why not? Keep in mind that the execution time refers to the simulated execution time.

6.2 Round-Robin Scheduler

Run your Round-Robin scheduler with timeslices of 800 ms, 600 ms, 400 ms, and 200 ms. Use only one CPU for your tests. Compare the statistics at the end of the simulation. Is there a relationship between the total waiting time and timeslice length? If so, what is it? In contrast, in a real OS, the shortest timeslice possible is usually not the best choice. Why not?

6.3 Preemptive Priority Scheduler

Priority schedulers can sometimes lead to starvation among processes with lower priority. What is a way that operating systems can mitigate starvation in a priority scheduler?

6.4 Priority Inversion

Consider a non-preemptive priority scheduler. Suppose you have a high-priority process (P1) that wants to display a window on the monitor. But the window manager is a process with low priority and will be placed at the end of the ready queue. While it is waiting to be scheduled, new medium-priority processes are likely to come in and starve the window manager process.
The starvation of the window manager will also mean the starvation of P1 (the high-priority process), since P1 is waiting for the window manager to finish running. If we want to keep our non-preemptive priority scheduler, what edits can we make to our scheduler to ensure that P1 can finish its execution before any of the medium-priority processes finish theirs? Explain in detail the changes you would make.

7 The Gradescope Environment

You will be submitting files to Gradescope, where they will be tested in a VM. The VM runs Ubuntu 20.04 LTS (64-bit) and gcc 9.3.0, so we expect that your files run in such a setup. This means that when running your project locally, you will want to use a VM or similar setup running Ubuntu 20.04 LTS (64-bit) and gcc 9.3.0; this way, what occurs locally is what will occur when you submit to Gradescope.

IMPORTANT: Since we are dealing with different threads of execution, the result of each run of the simulation will be different. As a result, our Gradescope autograder will accept a range of results for your simulation. In past semesters, we found that the range starts to increase the more computation you do in your enqueue/dequeue methods. If you find that you aren't passing the autograder but you believe that your code is right, it is likely that you need to trim down your enqueue/dequeue methods. We are planning to start our autograder with a tighter acceptable range and, if required, we will increase it.

8 Deliverables

NOTE: You need to upload student.c and answers.txt to Gradescope, and an autograder will run to check if your scheduler is working. The autograder might take a couple of minutes to run. Remember to upload both student.c and answers.txt for every submission, as your last submission is the one we will grade.
Keep your answers detailed enough to cover the question, including support from simulator results if appropriate. Don't write a book, but if you're not sure about an answer, give us more information.

9 How to Run / Debug Your Code – Debugging Deadlocks and Synchronization Errors

9.1 Running

We have provided a Makefile that will run gcc for you. To compile your code with no optimizations (which you should do while developing; it will make debugging easier) and test with the FCFS algorithm and one CPU, run:

$ make debug
$ ./os-sim 1

To run the other algorithms, run with the flags you implemented for round robin and priority. Remember that round robin requires you to enter a timeslice. In case you encounter difficulties with Project 4 and are uncertain about the direction to take, various resources are available to assist you.

9.2 GDB

Let us investigate how to debug deadlocks through a basic example. Following compilation and execution, the example program appears to become unresponsive. To investigate the root cause of this issue, run the program under the GNU Debugger (gdb). As anticipated, the program continues to remain unresponsive within the gdb environment. To interrupt it, press Ctrl+C. To analyze the various threads associated with the program, use the "info threads" command within gdb, which provides detailed information regarding each active thread. To obtain the backtrace of these threads, we can use the "thread apply all" command along with the "backtrace" command, which can be abbreviated as "t a a bt". The backtrace confirms that threads 2 and 3 are indeed stuck at the pthread_mutex_lock function.
To gain a more in-depth understanding of a specific thread's state, we can use the gdb command "thread [thread number]" to switch to a particular thread and examine its current state. By switching to thread 3 within gdb, we can identify the precise line of code where it has become deadlocked. Once we have identified the problematic line, we can use gdb's features, such as printing values or switching stack frames, to investigate further and gain a better understanding of the issue at hand. Read the gdb thread documentation for more information.

9.3 Valgrind (Helgrind or DRD)

Let's run Helgrind with the command 'valgrind --tool=helgrind <program>'. Upon executing Valgrind's Helgrind tool, we can observe that it has successfully identified an issue within the program: thread 2 is accessing a shared variable without acquiring a corresponding lock. Additionally, DRD (another tool within Valgrind) provides comparable output, albeit with fewer error lines; to compare, run the same program with 'valgrind --tool=drd <program>'. It is essential to rectify these issues to ensure proper synchronization and avoid potential data race conditions. Valgrind's Helgrind and DRD are also able to debug other types of synchronization errors; you can read the documentation for Helgrind and DRD for details. Credit to this video from an old class.

9.4 Tips and Tricks

1. When implementing enqueue() and dequeue(), think about all the possible edge cases. For example, how would you handle a case where both the head and tail are pointing to the same process?
2. When you dequeue, remember to set the next pointer of the removed process to NULL.
3. If you are deadlocking, use GDB. Follow the instructions above to understand the current state of the threads and pinpoint the issues. Think about whether you are properly locking and unlocking at the correct points in time. Use backtrace.
4. Make sure you are setting the state based on the handler/function.
Take a look at Figure 1 to understand what state should be set when.
5. Make sure you are setting the enqueue_time whenever you enqueue a process.
6. In wake_up(), make sure your final decision on whether to force a preemption is based on priority with age.
7. When locking and unlocking, don't mix up current_mutex and queue_mutex.
8. Utilize your approach for parsing command line arguments in HW 4 for Part 4.
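The command-line handling from tip 8 might look like the following sketch. The -r and -p option names come from the handout; the struct and the parse_sched_opts() helper are hypothetical, since the real parsing belongs inside the provided main():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option holder; the real code would set globals or
 * simulator parameters instead. */
typedef enum { SCHED_FCFS, SCHED_RR, SCHED_PRIORITY } sched_t;

typedef struct {
    sched_t algo;
    int timeslice;     /* tenths of a second; -1 means infinite (FCFS) */
    double age_weight; /* only meaningful for SCHED_PRIORITY */
} sched_opts_t;

/* Parse the scheduler flags, e.g. { "-r", "5" } or { "-p", "2" }.
 * With no flags, the default is FCFS with an infinite timeslice. */
sched_opts_t parse_sched_opts(int argc, char *argv[])
{
    sched_opts_t o = { SCHED_FCFS, -1, 0.0 };
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-r") == 0 && i + 1 < argc) {
            o.algo = SCHED_RR;
            o.timeslice = atoi(argv[++i]);
        } else if (strcmp(argv[i], "-p") == 0 && i + 1 < argc) {
            o.algo = SCHED_PRIORITY;
            o.age_weight = atof(argv[++i]);
        }
    }
    return o;
}
```

getopt() would work equally well here; the plain strcmp() loop is shown only because the handout allows either approach.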


[SOLVED] CS2200 Project 2

1 Introduction

We have spent the last few weeks implementing our 32-bit datapath. The simple 32-bit LC-3300 is capable of performing advanced computational tasks and logical decision making. Now it is time for us to move on to something more advanced: the upgraded LC-3300a adds the ability for programs to be interrupted. Your assignment is to fully implement and test interrupts using the provided datapath and CircuitSim. You will hook up the interrupt and data lines to the new timer device, modify the datapath and microcontroller to support interrupt operations, and write an interrupt handler to operate this new device. You will also use the tiny, inexpensive LC-3300a as an embedded system to monitor a kitchen appliance.

2 Requirements

Before you begin, please ensure you have done the following:

• The LC-3300a assembler is written in Python. If you do not have Python 2.6 or newer installed on your system, you will need to install it before you continue.

3 What We Have Provided

• A reference guide to the LC-3300a is located in Appendix A: LC-3300a Instruction Set Architecture. Please read this first before you move on! The reference introduces several new instructions that you will implement for this project.
• A microcode file (microcode.xlsx) that meets the requirements of Project 1; however, feel free to supply your own. This microcode file has a new configuration with additional bits for the new signals that will be added in this project.
• A timer device that will generate an interrupt signal at regular intervals. The pinout and functionality of this device are described in Adding an External Timer Device.
• A distance tracker that will generate an interrupt signal at regular intervals and provides distance readings. The pinout and functionality of this device are described in Adding a Distance Tracker.
• An incomplete assembly program prj2.s that you will complete and use to test your interrupt capabilities.
• An assembler with support for the new instructions to assemble the test program.
• A completed LC-3300 datapath circuit (LC-3300a.sim) from Project 1. Use this as a base to add the basic interrupt support for the LC-3300a, or build off of your own Project 1 datapath, but you must make sure the file is named LC-3300a.sim. Most of the work can be easily carried over from one datapath to another.

4 Phase 1 – Implementing a Basic Interrupt

Figure 1: Basic Interrupt Hardware for the LC-3300a Processor

For this assignment, you will add interrupt support to the LC-3300a datapath. Then, you will test your new capabilities to handle interrupts using an external timer device. Work in the LC-3300a.sim file. If you wish to use your existing datapath, make a copy with this name, and add the devices we provided.

4.1 Interrupt Hardware Support

First, you will need to add the hardware support for interrupts. You must do the following:

1. Our processor needs a way to turn interrupts on and off. Create a new one-bit "Interrupt Enable" (IE) register. You'll connect this register to your microcontroller in a later step.
2. Create the INT line. The external device you will add in 4.2 will pull this line high (assert a '1') when it wishes to interrupt the processor. Because multiple devices can share a single INT line, only one device can write to it at once. When a device does not have an interrupt, it pulls the line neither high nor low. You must accommodate this in your hardware by making sure that the final value going to the microcontroller always has a defined value (i.e., not a blue wire in CircuitSim). This can be done by using a specific gate to act like a pull-down resistor so that there is always a value asserted (see Appendix C for more information).
3. When a device receives an IntAck signal, it will drive its 32-bit device ID onto the I/O Data Bus.
To prevent devices from interfering with the processor, the I/O Data Bus is attached to the Main Bus with a tri-state driver. Create this driver and the bus, and attach the microcontroller's DrDATA signal to the driver.
4. Modify the datapath so that the PC starts at 0x08 when the processor is reset. Normally the PC starts at 0x00; however, we need to make space for the interrupt vector table (IVT). Therefore, when you load in the test code that you will write, it needs to start at 0x08. Please make sure that your solution ensures the datapath can never execute from below 0x08; in other words, force the PC to drive the value 0x08 if the PC is pointing into the range of the interrupt vector table.
5. Create hardware to support selecting the register $k0 within the microcode. This is needed by some interrupt-related instructions. Because we need to access $k0 outside of regular instructions, we cannot use the Rx / Ry / Rz bits. HINT: Use only the register selection bits that the main ROM already outputs to select $k0. Notice that there is an unused input to the RegSel multiplexer.

4.2 Adding an External Timer Device

Hardware timers are an essential device in any CPU design. They allow the CPU to monitor the passing of various time intervals without dedicating CPU instructions to the cause. The ability of timers to raise interrupts also enables preemptive multitasking, where the operating system periodically interrupts a running process to let another process take a turn. Timers are also essential to ensuring a single misbehaving program cannot freeze up your entire computer. You will connect an external timer device to the datapath. It is internally configured to have a device ID of 0x0 and to interrupt every 2000 clock ticks.

• CLK: The clock input to the device. Make sure you connect this to the same clock as the rest of your circuit.
• INT: The device will begin to assert this line when its time interval has elapsed.
It will not be lowered until the cycle after the device receives an INTA signal.
• INTA_IN: When the INTA_IN line is asserted while the device has asserted the INT line, it will drive its device ID onto the DATA line and lower its INT line on the next clock cycle.
• INTA_OUT: When the INTA_IN line is asserted while the device does not have an interrupt pending, its value will be propagated to INTA_OUT. This allows for daisy chaining of devices.
• DATA: The device will drive its ID (0x0) onto this line after receiving an INTA.

The INT and DATA lines from the timer should be connected to the appropriate buses that you added in the previous section.

4.3 Microcontroller Interrupt Support

Before beginning this part, be sure you have read through Appendix A: LC-3300a Instruction Set Architecture and Appendix B: Microcontrol Unit, and pay special attention to the new instructions. However, for this part of the project, you do not need to worry about the LdDAR signal or the IN instruction. In this part of the assignment you will modify the microcontroller and the microcode of the LC-3300a to support interrupts. You will need to do the following:

1. Be sure to read the appendix on the microcontroller before starting this section.
2. Modify the microcontroller to support asserting four new control signals:
(a) LdEnInt & EnInt, which control whether interrupts are enabled/disabled. You will use these two signals to control the value of your interrupts-enabled register.
(b) IntAck, which sends an interrupt acknowledge to the device.
(c) DrDATA, which drives the value on the I/O Data Bus onto the Main Bus.
3. Extend the size of the ROM accordingly.
4. Add the fourth ROM described in Appendix B: Microcontrol Unit to handle onInt.
5. Modify the FETCH macrostate microcode so that we actively check for interrupts. Normally this is done within the INT macrostate (as described in Chapter 4 of the book and in the lectures), but we are rolling this functionality into the FETCH macrostate for the sake of simplicity.
You can accomplish this by doing the following:

(a) First check to see if the CPU should be interrupted. To be interrupted, two conditions must be true: (1) interrupts are enabled (i.e., the IE register must hold a '1'), and (2) a device must be asserting a '1' on the INT signal line.
(b) If not, continue with FETCH normally.
(c) If the CPU should be interrupted, then we enter the INT macrostate and perform the following:
  i. Save the current PC to the register $k0.
  ii. Disable interrupts.
  iii. Assert the interrupt acknowledge signal (IntAck). Next, drive the device ID from the I/O Data Bus and use it to index into the interrupt vector table to retrieve the new PC value. The device will drive its device ID onto the I/O Data Bus one clock cycle after it receives the IntAck signal.
  iv. Load this new PC value into the PC.

Note: onInt works in the same manner that CmpOut did in Project 1. The processor should branch to the appropriate microstate depending on the value of onInt. onInt should be true when interrupts are enabled AND there is an interrupt to be acknowledged.

Note: The mode bit mechanism and user/kernel stack separation discussed in the textbook have been omitted for simplicity.

6. Implement the microcode for the three new interrupt-support instructions described in Chapter 4: EI, DI, and RETI. You need to write the microcode in the main ROM as well as the SEQ ROM for these three new instructions. Keep in mind that:
(a) EI sets the IE register to 1.
(b) DI sets the IE register to 0.
(c) RETI loads $k0 into the PC and enables interrupts.

4.4 Implementing the Timer Interrupt Handler

Our datapath and microcontroller now support receiving interrupts from devices, BUT we must now implement the interrupt handler timer_handler in the prj2.s file to handle interrupts from the timer device in a way that doesn't incorrectly interfere with any user programs.
In prj2.s, we provide you with a modified version of pow.s that will run while you are waiting for interrupts. For this part of the project, you need to write the interrupt handler for the timer device (device ID 0x0). You should refer to Chapter 4 of the textbook to see how to write a correct interrupt handler. As detailed in that chapter, your handler will need to do the following:

1. First save the current value of $k0 (the return address from which the handler was entered).
2. Enable interrupts (which were disabled implicitly by the processor within the INT macrostate).
3. Save the state of the interrupted process.
4. Implement the actual work to be done in the handler. In the case of this project, we want you to increment a counter variable in memory, which we have already provided.
5. Restore the state of the interrupted process and return using RETI.

The handler you have written for the timer device should run every time the device interrupts the processor. Make sure to write the handler such that interrupts can be nested. With that in mind, interrupts should be disabled for as few instructions as possible within the handler. You will need to do the following:

1. Write the interrupt handler (follow the steps above, or simply refer to Chapter 4 in your book). In the case of this project, we want the interrupt handler to keep track of time in memory at the predetermined location 0xFFFF.
2. Load the starting address of the handler you just implemented in prj2.s into the interrupt vector table at the appropriate address (the table is indexed by the device ID of the interrupting device).

Test your design before moving on to the next section. If it works correctly, you should see the value at address 0xFFFF in memory increment as the program runs.
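The FETCH-time interrupt check and INT macrostate described in 4.3 can be summarized behaviorally in C. Everything below is an illustrative model under assumed names, not CircuitSim logic or project code:

```c
#include <assert.h>

/* Behavioral model: the CPU takes an interrupt only when interrupts
 * are enabled AND a device is asserting INT. */
typedef struct {
    int pc;
    int k0;   /* $k0: where the interrupted PC is saved */
    int ie;   /* the one-bit Interrupt Enable register */
} cpu_model_t;

/* onInt, the condition the fourth ROM evaluates. */
int on_int(const cpu_model_t *cpu, int int_line)
{
    return cpu->ie && int_line;
}

/* One FETCH-time check. ivt[] maps device IDs to handler addresses.
 * Returns 1 if the INT macrostate ran, 0 if FETCH proceeds normally. */
int fetch_check(cpu_model_t *cpu, int int_line, int device_id,
                const int *ivt)
{
    if (!on_int(cpu, int_line))
        return 0;               /* (b) continue with FETCH normally */
    cpu->k0 = cpu->pc;          /* i.  save the current PC into $k0 */
    cpu->ie = 0;                /* ii. disable interrupts */
    /* iii. IntAck asserted; the device drives its ID the next cycle */
    cpu->pc = ivt[device_id];   /* iv. load the handler address from the IVT */
    return 1;
}
```

Note how a second interrupt is refused until the handler re-enables IE, which is exactly why the handler must save $k0 before executing EI.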
5 Phase 2 – Implementing Interrupts from Input Devices

Figure 2: Interrupt Hardware for the LC-3300a with Basic I/O Support

Eager to put your newfound knowledge of device interrupts from CS2200 to good use, you decide to apply what you've learned to your engineering passion: a distance tracker! You are interested in the maximum and minimum distances it measures. You've rigged up a device that is able to report the current distance measured to an LC-3300a processor via an interrupt. There's only one issue: as of right now, your datapath can detect when an external device is ready to interrupt the processor, but it cannot retrieve data from external devices. In this phase of the project, you will add functionality for device-addressed input. You will then make use of this functionality by adding a device simulating a distance tracker and writing a simple handler for the device.

5.1 Basic I/O Support

Before adding the distance tracker, you will first need to add support for device-addressed I/O. In order to get input from a device such as a distance tracker, you will write a value to an Address Bus, which instructs the device with that address (which in this case is the same as the device ID) to write its output data to the I/O Data Bus. You must do the following:

1. Create the device address register (DAR) and connect its enable to the LdDAR signal from your microcontroller. This register gets its input from the Main Bus, and its output is directly connected to the Address Bus. It allows us to assert a value on the Address Bus while using the Main Bus for other operations.
2. Modify the microcontroller to support a new control signal, LdDAR. This signal is used to enable writing to the DAR.
3. Implement the IN instruction in your microcode. This instruction takes a device address as an immediate (IR[19:0]), loads it into the DAR, and writes the value on the data bus into a register.
When it is done, it must clear the DAR (since interrupts use the data bus to communicate device IDs). Examine the format of the IN instruction and consider what signals you might raise in order to write a constant zero into the DAR.

5.2 Adding a Distance Tracker

You will connect a device to your datapath that simulates a distance tracker by returning the current distance. Its internals are similar to the timer device, meaning it asserts interrupts and handles acknowledgments in the same way. Every 1000 cycles, it will assert an interrupt signaling that a distance value has been captured. This distance can be fetched as a 32-bit word by writing the device's address to the ADDR line. The distance tracker is internally configured to have a device ID of 0x1. Place the distance tracker in your datapath circuit. This device will share the INT and DATA lines with the timer you added previously. However, it should receive its INTA signal from the INTA_OUT pin on the timer device. This ensures that if both the timer and the distance tracker raise an interrupt at the same time, the timer will be acknowledged first, and the distance tracker will be acknowledged after. This is known as "daisy chaining" devices.

5.3 Implementing the Distance Tracker Handler

Now that your LC-3300a datapath can accept data from your distance tracker, we need to decide what to do with the data. In this case, we want to keep track of the maximum and minimum distances we have seen so far in two particular memory locations, 0xFFFD and 0xFFFC. You'll have to implement this logic in your handler, which will work similarly to the one you wrote for the timer device. However, instead of incrementing a counter at a memory location, you will be keeping track of the maximum and minimum distances seen so far, as well as the range between them.
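The max/min/range bookkeeping described above is sketched below in C for clarity. The actual handler must be written in LC-3300a assembly against memory locations 0xFFFC–0xFFFE; the struct here is only a stand-in for those locations.

```c
#include <assert.h>

/* Stand-in for the three memory locations the handler maintains. */
typedef struct {
    int max;    /* what the handler keeps at 0xFFFD */
    int min;    /* what the handler keeps at 0xFFFC */
    int range;  /* what the handler keeps at 0xFFFE */
} tracker_state_t;

/* Logic run on each distance interrupt, after IN fetches the sample. */
void on_distance_sample(tracker_state_t *s, int distance)
{
    if (distance > s->max)          /* write 0xFFFD only on a new maximum */
        s->max = distance;
    if (distance < s->min)          /* write 0xFFFC only on a new minimum */
        s->min = distance;
    s->range = s->max - s->min;     /* store the range at 0xFFFE */
}
```

In the model, the state is seeded with the first sample (max = min = first reading, range = 0); how the real assembly initializes 0xFFFC/0xFFFD before the first interrupt is up to your implementation.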
In addition to the usual overhead of an interrupt handler, your distance tracker handler must do the following:

1. Use the IN instruction to obtain the most recently captured distance value from the distance tracker.
2. Write the value obtained from the distance tracker to memory location 0xFFFD only if the value is greater than the current maximum, or to 0xFFFC only if the value is less than the current minimum.
3. Calculate the range and store it in address 0xFFFE.

Make sure that you properly install the location of the new handler into the IVT. The distance tracker hardware is designed to emit a sequence of numbers representing distance readings. If your design is working properly, you should see the value stored in memory location 0xFFFD increase and the value in 0xFFFC decrease after a few thousand clock cycles, as they update when new distance values are pushed onto the datapath. To validate that you are updating the distances correctly, you can check the values that the distance tracker will emit by inspecting the internals of the circuit and checking the values in the ROM labeled 'Key Buffer'.

6 Autograder

Similar to the autograder for Project 1, the Project 2 autograder will execute your prj2.s using your datapath and simulate the interrupt handling process. It will tell you whether your handler code completes its job. Your final grade will not be determined by whether you pass the autograder. Feel free to use it as a tool to help you debug your circuit and assembly code, but you won't be able to rely on it entirely; you must still figure out which part of your datapath/microcode/handler code is not functioning as expected. If you want to use the autograder, you must follow a few rules:

• Don't rename the components that already exist.
• Name your IE register "IE".
• Name your Interrupt ROM "INT".
• Use only one clock globally.
• Use only one RAM as memory.
• Don't change the layout of the microcode Excel sheet.
• If you changed the constants in devices for debugging purposes, remember to change them back.

We strongly recommend that you first test your project locally before submitting to Gradescope. For details on how to test locally, see Appendix D.

7 Deliverables

To submit your project, you need to upload the following files to Gradescope:

• CircuitSim datapath file (LC-3300a.sim)
• Microcode file (microcode.xlsx)
• Assembly code (prj2.s)

If you are missing any of these files, you will get a 0, so make sure that you have uploaded all three of them. Always re-download your assignment from Gradescope after submitting to ensure that all necessary files were properly uploaded. If what we download does not work, you will get a 0 regardless of what is on your machine. This project will be demoed. In order to receive full credit, you must sign up for a demo slot and complete the demo. We will announce when demo times are released.

8 Appendix A: LC-3300a Instruction Set Architecture

The LC-3300a is a simple yet capable computer architecture. The LC-3300a combines attributes of both ARM and the LC-2200 ISA defined in the Ramachandran & Leahy textbook for CS 2200. The LC-3300a is a word-addressable, 32-bit computer. All addresses refer to words, i.e. the first word (four bytes) in memory occupies address 0x0, the second word, 0x1, etc. All memory addresses are truncated to 16 bits on access, discarding the 16 most significant bits if the address was stored in a 32-bit register. This provides roughly 64 KB of addressable memory.

8.1 Registers

The LC-3300a has 16 general-purpose registers. While there are no hardware-enforced restraints on the uses of these registers, your code is expected to follow the conventions outlined below.

Table 1: Registers and their Uses

Register Number | Name  | Use                       | Callee Save?
0               | $zero | Always Zero               | NA
1               | $at   | Assembler/Target Address  | NA
2               | $v0   | Return Value              | No
3               | $a0   | Argument 1                | No
4               | $a1   | Argument 2                | No
5               | $a2   | Argument 3                | No
6               | $t0   | Temporary Variable        | No
7               | $t1   | Temporary Variable        | No
8               | $t2   | Temporary Variable        | No
9               | $s0   | Saved Register            | Yes
10              | $s1   | Saved Register            | Yes
11              | $s2   | Saved Register            | Yes
12              | $k0   | Reserved for OS and Traps | NA
13              | $sp   | Stack Pointer             | No
14              | $fp   | Frame Pointer             | Yes
15              | $ra   | Return Address            | No

1. Register 0 is always read as zero. Any values written to it are discarded; regardless of what is written to this register, it should always output zero.
3. Register 2 is where you should store any returned value from a subroutine call.
4. Registers 3 – 5 are used to store function/subroutine arguments. Note: registers 2 through 8 should be placed on the stack if the caller wants to retain those values. These registers are fair game for the callee (subroutine) to trash.
5. Registers 6 – 8 are designated for temporary variables. The caller must save these registers if it wants these values to be retained.
7. Register 12 is reserved for handling interrupts.
8. Register 13 is the ever-changing top of the stack; it keeps track of the top of the activation record for a subroutine.
9. Register 14 is the anchor point of the activation frame. It is used to point to the first address on the activation record for the currently executing process.
10. Register 15 is used to store the address a subroutine should return to when it is finished executing.

8.2 Instruction Overview

The LC-3300a supports a variety of instruction forms, only a few of which we will use for this project. The instructions we will implement in this project are summarized below.
Table 2: LC-3300a Instruction Set

Opcode (bits 31–28) | Remaining fields (bits 27–0) | Mnemonic
0000 | DR, SR1, unused, SR2     | ADD
0001 | DR, SR1, unused, SR2     | NAND
0010 | DR, SR1, immval20        | ADDI
0011 | DR, BaseR, offset20      | LW
0100 | SR, BaseR, offset20      | SW
0101 | SR1, SR2, offset20       | BEQ
0110 | AT, RA, unused           | JALR
0111 | unused                   | HALT
1000 | SR1, SR2, offset20       | BGT
1001 | DR, unused, PCoffset20   | LEA
1010 | DR, SR1, unused, 00, SR2 | SLL
1010 | DR, SR1, unused, 01, SR2 | SRL
1010 | DR, SR1, unused, 10, SR2 | ROL
1010 | DR, SR1, unused, 11, SR2 | ROR
1011 | SR, unused               | FABS
1100 | unused                   | EI
1101 | unused                   | DI
1110 | unused                   | RETI
1111 | DR, 0000, addr20         | IN

8.2.1 Conditional Branching

Branching in the LC-3300a ISA is a bit different than usual. We have a set of branching instructions, including BEQ, BGT, and FABS, which offer the ability to branch upon a certain condition being met. These instructions use comparison operators, comparing the values of two source registers. If the comparison is true (for example, with the BGT instruction, if SR1 > SR2), then we will branch to the target destination of incrementedPC + offset20. For FABS, if SR < 0 then we will branch to the series of microstates for negation.

8.3 Detailed Instruction Reference

8.3.1 ADD

Assembler Syntax: ADD DR, SR1, SR2
Encoding: 0000 | DR | SR1 | unused | SR2
Operation: DR = SR1 + SR2;
Description: The ADD instruction obtains the first source operand from the SR1 register. The second source operand is obtained from the SR2 register. The second operand is added to the first source operand, and the result is stored in DR.

8.3.2 NAND

Assembler Syntax: NAND DR, SR1, SR2
Encoding: 0001 | DR | SR1 | unused | SR2
Operation: DR = ~(SR1 & SR2);
Description: The NAND instruction performs a logical NAND (AND NOT) on the source operands obtained from SR1 and SR2. The result is stored in DR.
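The branch-target arithmetic above adds a sign-extended 20-bit offset to the incremented PC. That sign extension can be sketched in C as follows; the helper names are illustrative, not project code:

```c
#include <assert.h>
#include <stdint.h>

/* SEXT(immval20): sign-extend the low 20 bits of an instruction field
 * to 32 bits. */
int32_t sext20(uint32_t field)
{
    field &= 0xFFFFF;                 /* keep IR[19:0] */
    if (field & 0x80000)              /* bit 19 is the sign bit */
        field |= 0xFFF00000;          /* replicate it through bits 31:20 */
    return (int32_t)field;
}

/* Branch target for BEQ/BGT: the PC has already been incremented
 * during FETCH before the offset is added. */
int32_t branch_target(int32_t incremented_pc, uint32_t offset20)
{
    return incremented_pc + sext20(offset20);
}
```

For example, an offset20 of 0xFFFFF is -1, so a taken branch at incremented PC 0x10 lands at 0xF.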
8.3.3 ADDI

Assembler Syntax
    ADDI DR, SR1, immval20
Encoding
    0010 | DR | SR1 | immval20
Operation
    DR = SR1 + SEXT(immval20);
Description
    The ADDI instruction obtains the first source operand from the SR1 register. The second source operand is obtained by sign-extending the immval20 field to 32 bits. The resulting operand is added to the first source operand, and the result is stored in DR.

8.3.4 LW

Assembler Syntax
    LW DR, offset20(BaseR)
Encoding
    0011 | DR | BaseR | offset20
Operation
    DR = MEM[BaseR + SEXT(offset20)];
Description
    An address is computed by sign-extending bits [19:0] to 32 bits and then adding this result to the contents of the register specified by bits [23:20]. The 32-bit word at this address is loaded into DR.

8.3.5 SW

Assembler Syntax
    SW SR, offset20(BaseR)
Encoding
    0100 | SR | BaseR | offset20
Operation
    MEM[BaseR + SEXT(offset20)] = SR;
Description
    An address is computed by sign-extending bits [19:0] to 32 bits and then adding this result to the contents of the register specified by bits [23:20]. The 32-bit word obtained from register SR is then stored at this address.

8.3.6 BEQ

Assembler Syntax
    BEQ SR1, SR2, offset20
Encoding
    0101 | SR1 | SR2 | offset20
Operation
    if (SR1 == SR2) { PC = incrementedPC + SEXT(offset20); }
Description
    A branch is taken if SR1 is equal to SR2. If this is the case, the PC will be set to the sum of the incremented PC (since we have already undergone fetch) and the sign-extended offset[19:0].

8.3.7 JALR

Assembler Syntax
    JALR AT, RA
Encoding
    0110 | AT | RA | unused
Operation
    RA = incrementedPC; PC = AT;
Description
    First, the incremented PC (address of the instruction + 1) is stored into register RA.
Next, the PC is loaded with the value of register AT, and the computer resumes execution at the new PC.

8.3.8 HALT

Assembler Syntax
    HALT
Encoding
    0111 | unused
Description
    The machine is brought to a halt and executes no further instructions.

8.3.9 BGT

Assembler Syntax
    BGT SR1, SR2, offset20
Encoding
    1000 | SR1 | SR2 | offset20
Operation
    if (SR1 > SR2) { PC = incrementedPC + SEXT(offset20); }
Description
    A branch is taken if SR1 is greater than SR2. If this is the case, the PC will be set to the sum of the incremented PC (since we have already undergone fetch) and the sign-extended offset[19:0].

8.3.10 LEA

Assembler Syntax
    LEA DR, label
Encoding
    1001 | DR | unused | PCoffset20
Operation
    DR = incrementedPC + SEXT(PCoffset20);
Description
    An address is computed by sign-extending bits [19:0] to 32 bits and adding this result to the incremented PC (address of instruction + 1). The computed address is then stored into register DR.

8.3.11 SLL

Assembler Syntax
    SLL DR, SR1, SR2
Encoding
    1010 | DR | SR1 | unused | 00 | SR2
Operation
    DR = SR1 << SR2;
Description
    The value stored in SR1 is logically left shifted by the value stored in SR2, and the result is stored in DR.

8.3.12 SRL

Assembler Syntax
    SRL DR, SR1, SR2
Encoding
    1010 | DR | SR1 | unused | 01 | SR2
Operation
    DR = SR1 >> SR2;
Description
    The value stored in SR1 is logically right shifted by the value stored in SR2, and the result is stored in DR.

8.3.13 ROL

Assembler Syntax
    ROL DR, SR1, SR2
Encoding
    1010 | DR | SR1 | unused | 10 | SR2
Operation
    DR = (SR1 << SR2) | (SR1 >> (32 - SR2));
Description
    Bits in SR1 are "rotated" left by SR2 number of bits using circular shifting. During a left rotation, the bits that are shifted out from the left are brought back in on the right side.

8.3.14 ROR

Assembler Syntax
    ROR DR, SR1, SR2
Encoding
    1010 | DR | SR1 | unused | 11 | SR2
Operation
    DR = (SR1 >> SR2) | (SR1 << (32 - SR2));
Description
    Bits in SR1 are "rotated" right by SR2 number of bits using circular shifting. During a right rotation, the bits that are shifted out from the right are brought back in on the left side.
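The shift and rotate semantics above can be sketched in Python, assuming 32-bit registers and shift amounts in the range 0 – 31 (the spec's formulas do not define behavior for larger amounts). The masking against 0xFFFFFFFF keeps Python's arbitrary-precision integers confined to a 32-bit register width; the helper names are our own.

```python
MASK32 = 0xFFFFFFFF

def sll(x, n):
    # SLL: DR = SR1 << SR2, truncated to 32 bits
    return (x << n) & MASK32

def srl(x, n):
    # SRL: DR = SR1 >> SR2 (logical: shift the unsigned 32-bit pattern)
    return (x & MASK32) >> n

def rol(x, n):
    # ROL: DR = (SR1 << SR2) | (SR1 >> (32 - SR2))
    # bits shifted out on the left re-enter on the right
    return ((x << n) | ((x & MASK32) >> (32 - n))) & MASK32

def ror(x, n):
    # ROR: DR = (SR1 >> SR2) | (SR1 << (32 - SR2))
    # bits shifted out on the right re-enter on the left
    return (((x & MASK32) >> n) | (x << (32 - n))) & MASK32

assert sll(0xFFFFFFFF, 4) == 0xFFFFFFF0
assert srl(0x80000000, 31) == 0x00000001
assert rol(0x80000001, 4) == 0x00000018
assert ror(0x80000001, 4) == 0x18000000
```

Note that for n = 0 the formulas still work in Python, since x >> 32 is 0 and (x << 32) & MASK32 is 0, leaving the value unchanged.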
