Please follow the steps below to test your Project 2. Download the zip of this git repository. Unzip the repository. Implement your kernel module code in memory_manager.c (i.e., all //TODO parts; DO NOT change any other part). Run the test.sh script with the test arguments.

Test Cases:

| # | Command | Expected result | Points |
|---|---------|-----------------|--------|
| 2 | ./test.sh 2 | The WSS results from your kernel module output should be 300 MB (5 pts), 200 MB (5 pts), and 100 MB (5 pts), and all RSS results should stay at 300 MB (5 pts) | 20 |
| 3 | sudo ./test.sh 3 | Bonus! The RSS + SWAP results from your kernel module output should match the memory pressure. For iteration 1, SWAP size should be 0 (5 pts). For iteration 3, SWAP size should be larger than 0 (5 pts). | 1 extra point of your CSE330 final total grade |

Note: Sample output screenshots: Test Case 0, Test Case 1, Test Case 2, Test Case 3.
Download the zip of this git repository. Unzip the repository. Implement your kernel module code in the template (producer_consumer.c). Run the test.sh script with the test arguments:

bash test.sh <Number of processes to be spawned> <Buffer Size> <Number of Producers> <Number of Consumers> <Number of lines to be displayed from the dmesg>

Test Cases:

| # | Command | Expected result | Points |
|---|---------|-----------------|--------|
| 1 | ./test.sh 10 5 1 0 25 | The number of items produced will be equal to the buffer size 5. (3 pts) Module should exit without error. (2 pts) | 5 |
| 2 | sudo ./test.sh 10 5 0 1 25 | Consumer should consume nothing. (2 pts) Module should exit without error. (3 pts) | 5 |
| 3 | sudo ./test.sh 10 50 1 1 25 | The total number of produced items should be equal to 10. Each process should be produced and consumed only once. (4 pts) CPU time should match the ps command output. (4 pts) Module should exit without error. (2 pts) | 10 |
| 4 | sudo ./test.sh 100 50 1 1 25 | The total number of produced items should be equal to 100. Each process should be produced and consumed only once. (6 pts) CPU time should match the ps command output. (6 pts) Module should exit without error. (3 pts) | 15 |
| 5 | sudo ./test.sh 1000 50 1 1 100 | The total number of produced items should be equal to 1000. Each process should be produced and consumed only once. (6 pts) CPU time should match the ps command output. (6 pts) Module should exit without error. (3 pts) | 15 |

a) Bonus Test Case: if your code also passes the bonus test case (the module should exit without error), you earn 10 extra points.

Note: Please do not make any changes to the provided test code to pass the test cases. Sample output screenshots: Test Case 1, Test Case 2, Test Case 3, Test Case 4, Test Case 5, Test Case 6.
Polytime Reductions, Self-Reductions, and PS

1. Consider the language MIN-BLOCKS defined as:

Instance: A matrix M and an integer k.
Question: We're allowed to permute the rows and columns of M as we like, and we want to do so in a way that minimizes the number of single-number rectangular "blocks" needed to describe the matrix. E.g., if we were to write (a, b, c, d; 137), where 1 ≤ a ≤ c ≤ m and 1 ≤ b ≤ d ≤ n, then we would know that every entry M_{i,j} with a ≤ i ≤ c and b ≤ j ≤ d holds the number 137. Our question is: can we permute the rows and columns of M in such a way that we need no more than k blocks to store it?

As an example, for the small matrix M shown in the original handout, we can permute the second and third rows and the first and second columns to get a matrix that four blocks suffice to describe. Notice that we allow the third and fourth blocks to overlap, since they store the same number. You will also find that we can't use fewer than four blocks to store this matrix, regardless of the permutation. So if I give you this M and a k ≥ 4, it will be a yes-instance. If I give you a k < 4, it will be a no-instance.

You can assume this language is in NP, but you have to show that it is NP-complete (so you have to show that it's NP-hard). With this in mind, give a reduction that shows MIN-BLOCKS is NP-hard. Reduce from some language in the Polytime Reduction Examples 1 or the Polytime Reduction Examples 2 handout.

2. Consider the language CLOSE-SOLUTION-3SAT defined as:

Instance: A 3CNF φ and a satisfying truth assignment τ for φ.
Question: Does φ have a second satisfying truth assignment τ′ ≠ τ such that τ and τ′ agree on at least 7/8 of their variable assignments?

You can assume this language is in NP, but you have to show that it is NP-complete (so you have to show that it's NP-hard). With this in mind, give a proof that CLOSE-SOLUTION-3SAT is NP-hard.

3. Consider the language PATH-CONSTRAINT defined as:

Instance: A directed graph G = (V, E), two vertices s and t, and two numbers r and k.
Question: Can we find a size-r subset R of vertices in V such that, if P_R is the longest simple path from s to t using only vertices in R, then the number of vertices in P_R is somewhere between 2 and k − 1?

Let φ = ∀x1,x2 ∃x3 ∀x4 (¬(¬x1 ∨ ¬x2) ∨ x2 ∨ ¬(¬x1 ∧ (x3 ∨ x4))) ∧ x3 ∧ (¬x1 ∨ x4). Draw f(φ) (do not simplify the formula before applying f). Label your graph appropriately and indicate the starting node.

You've already given a proof that this language is NP-complete, but we can also ask how, given an oracle for STORAGE-BOX, to find a solution for a particular STORAGE-BOX instance. Show how you can use an oracle for STORAGE-BOX to build a polytime oracle program FIND-STORAGE-BOXES such that for any instance of STORAGE-BOX: if any solution exists for that instance, FIND-STORAGE-BOXES will return some such solution; if no such solution exists, FIND-STORAGE-BOXES will return null. Justify why your program is correct and runs in polytime.

You've already been given a proof that this language is NP-complete, but we can also ask how, given an oracle for FEEDBACK-ARC-SET, to find an optimal feedback arc set for a particular directed graph G. Show how you can use an oracle for FEEDBACK-ARC-SET to build a polytime oracle program MINIMIZE-FEEDBACK-ARC that takes as input any directed graph G and returns a smallest feedback arc set for G. Justify why your program is correct and runs in polytime.

Modify the NP-hardness proof for 3DM to make it a parsimonious reduction from #3SAT. Modify the NP-hardness proof for FEEDBACK-ARC-SET to make it a parsimonious reduction.

Consider the SPIRAL-GALAXIES reduction described in the Polytime Reduction Examples 2 handout. (Note: this handout won't be out at the time Assignment 3 is posted, but it will be up soon.) In this reduction we encode a number of widgets that represent wires, inputs, and-gates, and other circuit components, and we use those components to build a representation of a 3CNF φ.
But in that reduction we miss one component: we don't show a way to bend the wires we build, and so we're forced to use a workaround in which the circuit wires always face in the same direction and move laterally by shifting. Technically we'd also need to modify the crossover widget to get this to work as well, but this is the interesting part. To answer this question, find a way to encode a bent wire (using the encoding from the reduction) into a SPIRAL-GALAXIES game. Carefully explain why your widget works.
1. Producer Thread

The producer thread searches the system for all the processes that belong to a given user (in the test scripts, we will automatically create a new user: test_cse330) and adds their task_struct to the shared buffer. There is only one producer in the system, and it exits after it has iterated through the entire task list.

The kernel stores all the processes in a circular doubly linked list called the task list. Each element in the task list is a process descriptor of type struct task_struct, which contains all the information about a process. (The figure in the original handout shows the task_struct descriptors linked into the task list, with fields such as state, prio, policy, parent, tasks, and pid.) We use the for_each_process macro to iterate through the task list and access each task_struct; the macro visits every task_struct in the task list one by one. The task_struct holds the process PID in pid and the process user's UID in cred->uid.val.

The following code example counts the processes in the task list:

```c
#include <linux/sched.h>
#include <linux/sched/signal.h>  /* provides for_each_process */

struct task_struct *p;
size_t process_counter = 0;

for_each_process(p) {
    /* On each iteration, p points to the next task in the list:
     *   p->pid            PID of the process
     *   p->cred->uid.val  UID of the process's user
     */
    process_counter++;
}
```

CSE330 Project 4: Bounded Buffer Producer Consumer Problem
Introduction

In this assignment, you will explore the implementation of a particular file system, ext2, and will write tools to modify ext2-format virtual disks. To do this work, you will need to be comfortable working with binary data and will need to learn about the ext2 filesystem. Please read the specifications carefully to make sure everything is implemented correctly. For implementing any additional functionality which is not specified in the handout, or if you are unsure whether to handle some inode parameters in specific cases (after having read the documentation!), please ask on the discussion board.

Requirements

Your task is to write a set of programs (in C) that operate on an ext2 formatted virtual disk. The executables must be named exactly as listed below, and must take the specified arguments.

ext2_mkdir: This program takes two command line arguments. The first is the name of an ext2 formatted virtual disk. The second is an absolute path on your ext2 formatted disk. The program should work like mkdir, creating the final directory on the specified path on the disk. If any component on the path to the location where the final directory is to be created does not exist, or if the specified directory already exists, then your program should return the appropriate error (ENOENT or EEXIST). Again, please read the specifications to make sure you're implementing everything correctly (e.g., directory entries should be aligned to 4B, entry names are not null-terminated, etc.).

ext2_ln: This program takes three command line arguments. The first is the name of an ext2 formatted virtual disk. The other two are absolute paths on your ext2 formatted disk. The program should work like ln, creating a link from the first specified file to the second specified path. This program should handle any exceptional circumstances, for example: if the source file does not exist, return the appropriate error. The program also supports an "-s" flag, after the disk image argument. When this flag is used, your program must create a symlink instead (other arguments remain the same).
If in doubt about correct operation of links, use the ext2 specs and ask on the discussion board.

ext2_rm: This program takes two command line arguments. The first is the name of an ext2 formatted virtual disk, and the second is an absolute path to a file or link (not a directory) on that disk. The program should work like rm, removing the specified file from the disk. If the file does not exist or if it is a directory, then your program should return the appropriate error. Once again, please read the specifications of ext2 carefully, to figure out what needs to actually happen when a file or link is removed (e.g., no need to zero out data blocks, must set i_dtime in the inode, removing a directory entry need not shift the directory entries after the one being deleted, etc.).

BONUS: Implement an additional "-r" flag (after the disk image argument), which allows removing directories as well. In this case, you will have to recursively remove all the contents of the directory specified in the last argument. If "-r" is used with a regular file or link, then it should be ignored (the ext2_rm operation should be carried out as if the flag had not been entered). If you decide to do the bonus, make sure first that your ext2_rm works, then create a new copy of it, rename it to ext2_rm_bonus.c, and implement the additional functionality in this separate source file.

Hint: The file to be restored will not appear in the directory entries of the parent directory, unless you search the "gaps" left when files get removed. The directory entry structure is the key to finding out these gaps and searching for the removed file.

Note(3): We will not try to recover files that had hardlinks at the time of removal.
This is because when trying to restore a file, if its inode is already in use, there are two options: the file we're trying to restore previously had other hardlinks (and hence its inode never really got invalidated), or its inode has been re-allocated to a completely new file. Since there is no way to tell between these 2 possibilities, recovery in this case should not be attempted.

BONUS: Implement an additional "-r" flag (after the disk image argument), which allows restoring directories as well. In this case, you will have to recursively restore all the contents of the directory specified in the last argument. If "-r" is used with a regular file or link, then it should be ignored (the restore operation should be carried out as if the flag had not been entered). If you decide to do the bonus, make sure first that your ext2_restore works, then create a new copy of it, rename it to ext2_restore_bonus.c, and implement the additional functionality in this separate source file.

ext2_checker: This program takes only one command line argument: the name of an ext2 formatted virtual disk. The program should implement a lightweight file system checker, which detects a small subset of possible file system inconsistencies and takes appropriate actions to fix them (as well as counting the number of fixes), as follows:

a. The superblock and block group counters for free blocks and free inodes must match the number of free inodes and data blocks as indicated in the respective bitmaps. If an inconsistency is detected, the checker will trust the bitmaps and update the counters. Once such an inconsistency is fixed, your program should output the following message: "Fixed: X's Y counter was off by Z compared to the bitmap", where X stands for either "superblock" or "block group", Y is either "free blocks" or "free inodes", and Z is the difference (in absolute value).
The Z values should be added to the total number of fixes.

b. For each file, directory, or symlink, you must check if its inode's i_mode matches the directory entry file_type. If it does not, then you shall trust the inode's i_mode and fix the file_type to match. Once such an inconsistency is repaired, your program should output the following message: "Fixed: Entry type vs inode mismatch: inode [I]", where I is the inode number for the respective file system object. Each inconsistency counts towards the total number of fixes.

c. For each file, directory, or symlink, you must check that its inode is marked as allocated in the inode bitmap. If it isn't, then the inode bitmap must be updated to indicate that the inode is in use. You should also update the corresponding counters in the block group and superblock (they should be consistent with the bitmap at this point). Once such an inconsistency is repaired, your program should output the following message: "Fixed: inode [I] not marked as in-use", where I is the inode number. Each inconsistency counts towards the total number of fixes.

d. For each file, directory, or symlink, you must check that its inode's i_dtime is set to 0. If it isn't, you must reset it (to 0), to indicate that the file should not be marked for removal. Once such an inconsistency is repaired, your program should output the following message: "Fixed: valid inode marked for deletion: [I]", where I is the inode number. Each inconsistency counts towards the total number of fixes.

e. For each file, directory, or symlink, you must check that all its data blocks are allocated in the data bitmap. If any of its blocks is not allocated, you must fix this by updating the data bitmap. You should also update the corresponding counters in the block group and superblock (they should be consistent with the bitmap at this point).
Once such an inconsistency is fixed, your program should output the following message: "Fixed: D in-use data blocks not marked in data bitmap for inode: [I]", where D is the number of data blocks fixed, and I is the inode number. Each inconsistency counts towards the total number of fixes.

Your program must count all the fixed inconsistencies, and produce one last message: either "N file system inconsistencies repaired!", where N is the number of fixes made, or "No file system inconsistencies detected!".

All of these programs should be minimalist. Don't implement what isn't specified: only provide the required functionality and specified errors. (For example, don't implement wildcards. Also, can't delete directories? Too bad! Unless you want the bonus! 🙂) You will find it very useful for these programs to share code. You will want a function that performs a path walk, for example. You will also want a function that opens a specific directory entry and writes to it.

To help you visualize your file system, we are giving you an already built tool, called ext2_ls. This program takes two command line arguments. The first is the name of an ext2 formatted virtual disk. The second is an absolute path on the ext2 formatted disk. The program works like ls -1a (that's number one "1", not lowercase letter "L"): it prints one line for every directory entry (including "." and "..") from the directory specified by the absolute path. If the path does not exist, it prints "No such file or directory". If the path is a file or link, the tool simply prints the full path on a single line.

We are also giving you a tool called ext2_dump, which dumps all the raw information about the image contents. This is very similar to the readimage program that you have to implement for the tutorial exercises that give you practice with ext2 images.
Once again, we encourage you to work on the tutorial exercises first, to gain experience with extracting various bits of information from an ext2 image.

We are also giving you a tool called ext2_corruptor, which corrupts file system images, introducing various inconsistencies like the ones that you have to fix. This tool has limited capabilities, and is solely to help you with basic testing. You are welcome to develop your own corruptor tool as well.

Finally, to help you determine some basic correctness, we are giving you some sample test cases: running a set of commands and their expected outputs (image dumps). These self-tester test cases are found on the teaching labs under: /u/csc369h/winter/pub/public/A4-self-test

Learning about the Filesystem

Here are several sample virtual disk images:

- emptydisk (./images/emptydisk.img): An empty virtual disk.
- onefile (./images/onefile.img): A single text file has been added to emptydisk.
- deletedfile (./images/deletedfile.img): The file from onefile has been removed.
- onedirectory (./images/onedirectory.img): A single directory containing a text file has been added to emptydisk.
- hardlink (./images/hardlink.img): A hard link to the text file in onedirectory was added.
- deleteddirectory (./images/deleteddirectory.img): A recursive remove was used to remove the directory and file from onedirectory.
- twolevel (./images/twolevel.img): The root directory contains a directory called level1 and a file called afile. level1 contains a directory called level2, and level2 contains a file called bfile.
- twolevelcorrupt (./images/twolevel-corrupt.img): Same as twolevel, except that the image contains file system inconsistencies that you will have to repair with the checker.
- twolevel-norestore-afile (./images/twolevel-norestore-afile.img): Same as twolevel, except that /afile has been removed, and its inode number has been reused. This image can be used for testing the case when a file cannot be restored.
- largefile (./images/largefile.img): A file larger than 13KB (13440 bytes) is in the root directory. This file requires the single indirect block in the inode.

These disks were each created and formatted in the same way (on an ubuntu virtual machine):

% dd if=/dev/zero of=~/DISKNAME.img bs=1024 count=128
% mke2fs -N 32 DISKNAME.img
% sudo mount -o loop ~/DISKNAME.img /home/bogdan/mntpoint
% cd /home/bogdan/mntpoint
% ...... normal linux commands to add/remove files/directories/links .....
% cd ~
% umount /home/bogdan/mntpoint

Since we are creating images with mke2fs, the disks are formatted with the ext2 file system. The wikipedia page for ext2 (http://en.wikipedia.org/wiki/Ext2) provides a good overview, but the Ext2 wiki (http://wiki.osdev.org/Ext2) and Dave Poirer's Second Extended File System (http://www.nongnu.org/ext2doc/index.html) article provide more detail on how the system places data onto a disk. It's a good reference to keep on hand as you explore.

We are restricting ourselves to some simple parameters, so you can make the following assumptions when you write your code:

- A disk is 128 blocks where the block size is 1024 bytes.
- There is only one block group.
- There are 32 inodes.
- You do not have to worry about permissions or modified time fields in the inodes. You should set the type (in i_mode), i_size, i_links_count, i_blocks (disk sectors), and the i_block array.
- We will not test your code on anything other than disk images that follow this specification, or on corrupted disk images.

Other tips:

- Inode and disk block numbering starts at 1 instead of 0.
- The root inode is inode number 2 (at index 1). The first 11 inodes are reserved.
- There is always a lost+found directory in the root directory.
- Disk sectors are 512 bytes. (This is relevant for the i_blocks field of the inode.)
- You should be able to handle directories that require more than one block.
- You should be able to handle a file that needs a single indirection.

Although you can construct your own structs from the information in the documentation above, you are welcome to use the ext2.h (./ext2.h) file that I used for the test code. I took out a bunch of components that we aren't using, but there are still quite a few fields that are irrelevant for our purposes. However, you will probably also want to explore the disk images to get an intuitive sense of how they are structured. (The next three exercises will also help you explore the disk images and get started on the assignment.)

There are two good ways to interface with these images. The first way is to interact with an image like a user, by mounting the file system so that you can use standard commands (mkdir, cp, rm, ln). Details of how to do this are below. The second way is to interact with the disk as if it is a flat binary file. Use xxd to create hex dumps, diff to look for differences between the dumps, and your favorite text editor to view the diffs. For example (YMMV):

% diff
Introduction

In this assignment, you will have to simulate the operation of page tables and page replacement. As I keep saying: the way to gain a solid understanding of the theory is by applying the concepts in practice. You have two tasks in this assignment, which will be based on a virtual memory simulator. The first task is to implement virtual-to-physical address translation and demand paging using a two-level page table. The second task is to implement four different page replacement algorithms: FIFO, Clock, exact LRU, and OPT.

Before you start work, you should complete the set of readings about memory, if you haven't done so already: Paging: Introduction (http://pages.cs.wisc.edu/~remzi/OSTEP/vm-paging.pdf)

Requirements

Setup

You will find the starter code HERE (starter.tar.gz). It is your responsibility to add the code to your repository and make sure that you submit all the necessary files! DO NOT MANUALLY CREATE A NEW DIRECTORY 'A3' IN YOUR REPOSITORY! An empty directory called 'A3' should be created for you automatically when you log into MarkUs and go on your A3 link. Just do an svn update to see the newly-created 'A3' directory in your repository. As usual, please make sure in advance that you can access your A3 directory, to avoid last-minute surprises. Compile the trace programs and generate the traces.

Task 1 – Address Translation and Paging

Implement virtual-to-physical address translation and demand paging using a two-level page table. The main driver for the memory simulator, sim.c, reads memory reference traces in the format produced by the fastslim.py tool from valgrind memory traces. For each line in the trace, the program asks for the simulated physical address that corresponds to the given virtual address by calling find_physpage, and then reads from that location. If the access type is a write ("M" for modify or "S" for store), it will also write to the location.
You should read sim.c so that you understand how it works, but you should not have to modify it. The simulator is executed as:

./sim -f <tracefile> -m <memory size> -s <swapfile size> -a <algorithm>

where memory size and swapfile size are the number of frames of simulated physical memory and the number of pages that can be stored in the swapfile, respectively. The swapfile size should be as large as the number of unique virtual pages in the trace, which you should be able to determine easily.

There are four main data structures that are used:

1. char *physmem: This is the space for our simulated physical memory. We define a simulated page size (and hence frame size) of SIMPAGESIZE and allocate SIMPAGESIZE * "memory size" bytes for physmem.

2. struct frame *coremap: The coremap array represents the state of (simulated) physical memory. Each element of the array represents a physical page frame. It records whether the physical frame is in use and, if so, a pointer to the page table entry for the virtual page that is using it.

3. pgdir_entry_t pgdir[PTRS_PER_PGDIR]: We are using a two-level page table design; the top level is referred to as the page directory, which is represented by this array. Each page directory entry (pde_t) holds a pointer to a second-level page table (which we refer to simply as page tables, for short). We use the low-order bit in this pointer to record whether the entry is valid or not.

4. The page tables are arrays of page table entries (pte_t), which consist of a frame number if the page is in (simulated) physical memory and an offset into the swap file if the page has been written out to swap. The format of a page table entry is shown in the figure in the original handout.

Note that the frame number and status bits share a word, with the low-order PAGE_SHIFT bits (12 in our implementation) used for status (we only have 4 status bits, but you can add more if you find it useful). Thus, for a given physical frame number (e.g., 7), remember to shift it over to leave room for the status bits (e.g., frame << PAGE_SHIFT).
To complete this task, you will have to write code in pagetable.c. Read the code and comments in this file — it should be clear where implementation work is needed and what it needs to do. The rand replacement algorithm is already implemented for you, so you can test your translation and paging functionality independently of implementing the replacement algorithms.

Task 2

Using the starter code, implement each of the four different page replacement algorithms: FIFO, exact LRU, CLOCK (with one ref-bit), OPT. You will find that you want to add fields to the struct frame for the different page replacement algorithms. You can add them in pagetable.h, but please label them clearly.

Once you're done implementing the algorithms, run all three programs from the provided traceprogs, plus a fourth program of your choosing with interesting memory reference behaviour, using each of your algorithms (include rand as well). For each algorithm, run the programs on memory sizes 50, 100, 150, and 200. Use the data from these runs to create a set of tables that include the following columns (please label your columns in the following order):

- Hit rate
- Hit count
- Miss count
- Overall eviction count
- Clean eviction count
- Dirty eviction count

Write up

Include a file called README.txt or README.pdf that includes the following information:

- The tables prepared in Task 2
- One paragraph comparing the various algorithms in terms of the results you see in the tables.
- A second paragraph explaining the data you obtained for LRU as the size of memory increases.
Marking Scheme

- Task 1: 35%
- Task 2: FIFO 5%, LRU 10%, CLOCK 10%, OPT 15% (must be able to run all traces in a reasonable amount of time)
- Tables: 5%
- Comparison paragraph: 5%
- LRU description: 5%
- Program readability and organization: 10%

Submission

Additionally, you must add and commit an INFO.txt file, which contains as the first 3 lines the following: your name, your UtorID. Aside from this, please feel free to describe problems you've encountered, what isn't fully implemented (or doesn't work fully), any special design decisions you've taken, etc.

Make sure your code compiles without any errors or warnings. Make sure you have included all the files needed to correctly run your programs. Make sure you have included the INFO.txt file and the correct revision in it. As previously, to check that your assignment submission is complete, please do the following:

1. Create an empty temporary directory in your cdf account (not in a subdirectory of your repo).
2. Check out a copy of your repository for this assignment.
3. Verify that your README and INFO.txt files are included.
4. Run make and ensure that you are able to build sim without any errors or warnings. (This is an excellent way to verify that all files have been committed to the repo.)
5. Run a few tests using the same traces you used to create the tables in your README, to ensure that your code behaves as you expect.
6. Congratulate yourself and enjoy a well-earned break, knowing that your strategy and hard work will pay off!
It was such a relief to program in user mode for a change. — Linus Torvalds, to the Git mailing list

Overview

For this assignment, we're going back to the realm of user mode programming. In this assignment, you will work with synchronization primitives to implement a traffic monitoring and guidance system. Autonomous (self-driving) cars are increasingly becoming more reliable, as the development of smart technologies helps guide the vehicle through its environment safely. Autonomous cars depend on several sensors and complex logic to coordinate with other vehicles in a safe manner. At an intersection, for example, cars must have a policy to coordinate crossing the intersection in order to avoid crashes and to ensure that everyone goes through the intersection eventually. Your job is to implement a traffic system that coordinates cars going through an intersection, by enforcing ordering between car arrivals in the same direction, avoiding potential crashes, and avoiding deadlocks.

Details

Consider the structure of the intersection, as shown in the figure below. There are four entrances/exits to the intersection, each corresponding to a cardinal direction: North (N), South (S), East (E) and West (W). Each entrance into the intersection is referred to as a lane and is represented by a fixed-size buffer. Exits from the intersection are represented by a simple linked list for each exit direction (see starter code (starter_code.tar.gz)). The intersection itself is broken into quadrants, each represented by a lock. Cars are represented by a 3-tuple (id, in_dir, out_dir), where id is a car's unique id, in_dir is the direction from which the car enters the intersection, and out_dir is the direction the car leaves the intersection. The traffic system enforces the policy for cars to pass through the intersection safely and in an orderly fashion. You will implement the traffic system using a monitor.

Policies:

1.
Once a car approaches an intersection, it is placed in the corresponding lane. The car can only cross the intersection when all other cars already in the lane have crossed. 2. Once a car is the first in the lane, it can begin to cross the intersection. Before the car can enter the intersection, it must guarantee that it has a completely clear path to its destination.

Implementation
Your task is to implement the methods for the monitor, given to you in the starter code (starter_code.tar.gz). The monitor has 3 methods that you must fill in:

/* Initialize all the monitor attributes */
void init_intersection();

/* Car approaches the intersection, must be added to the lane */
void *car_arrive(void *arg);

/* Car enters the intersection, must check for clear path */
void *car_cross(void *arg);

Additionally, the monitor contains the helper function below (which you must also fill in), which will assist in making decisions about how cars cross the intersection:

/* Given two directions, returns the path a car would take through the intersection */
int *compute_path(enum direction in_dir, enum direction out_dir);

There are two threads per cardinal direction: one executes car_arrive() and another executes car_cross(). You must use the starter code provided, which gives you detailed instructions on what you need to implement. Please make sure to implement all the parts indicated using detailed TODO comments. The starter code contains a program (traffic.c) that serves as the entry point for the assignment. The traffic program takes the configuration of the approaching cars from a file, which contains a series of rows, each corresponding to a car. Each row is formatted as: id, in_dir, out_dir. For example:

1 0 2
2 1 2
3 1 3
4 0 1

You must ensure that there are no memory leaks in your code. DO NOT free the out_cars array, because we will be inspecting its contents in the autotester to verify some integrity constraints regarding the correctness of your implementation.
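The quadrant-claiming part of policy 2 can be sketched with a condition variable: a car waits until every quadrant on its path is free, then claims the whole path atomically. The Python sketch below is illustrative only — the quadrant numbering and the compute_path mapping are assumptions of ours, and the actual assignment must be written in C against the starter code's monitor interface.

```python
import threading

N, E, S, W = 0, 1, 2, 3  # cardinal directions

def compute_path(in_dir, out_dir):
    """Illustrative only: the set of quadrants a car sweeps through,
    assuming quadrants 0=NW, 1=NE, 2=SE, 3=SW. A right turn uses one
    quadrant, going straight uses two, a left turn uses three."""
    paths = {
        (N, W): {0}, (W, S): {3}, (S, E): {2}, (E, N): {1},              # right turns
        (N, S): {0, 3}, (S, N): {2, 1}, (E, W): {1, 0}, (W, E): {3, 2},  # straight
        (N, E): {0, 3, 2}, (E, S): {1, 0, 3},                            # left turns
        (S, W): {2, 1, 0}, (W, N): {3, 2, 1},
    }
    return paths[(in_dir, out_dir)]

class Intersection:
    def __init__(self):
        self.lock = threading.Lock()
        self.cv = threading.Condition(self.lock)
        self.busy = set()              # quadrants currently occupied

    def enter(self, path):
        with self.cv:
            while self.busy & path:    # policy 2: wait for a completely clear path
                self.cv.wait()
            self.busy |= path          # claim every quadrant at once (no deadlock)

    def leave(self, path):
        with self.cv:
            self.busy -= path
            self.cv.notify_all()       # wake cars waiting on freed quadrants
```

Claiming the entire path atomically (rather than one quadrant lock at a time) is one way to avoid the circular-wait condition for deadlock.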
Testing
One of valgrind's tools is called helgrind, a synchronization error checker (for lack of a better word), which is tasked with helping developers identify synchronization bugs in programs that use POSIX pthreads primitives. You are encouraged to read through the Documentation (http://valgrind.org/docs/manual/hg-manual.html) for helgrind, in order to understand the kinds of synchronization errors it can detect and how each of these is reported. The simplest example of how to use helgrind is on the producer-consumer code from lecture (http://www.cdf.toronto.edu/~csc369h/winter/lectures/w4/prodcons_condvar.c). As you can see, no synchronization errors are reported, and you should get a report that ends similarly to the one below:

==9395==
==9395== For counts of detected and suppressed errors, rerun with: -v
==9395== Use --history-level=approx or =none to gain increased speed, at
==9395== the cost of reduced accuracy of conflicting-access information
==9395== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 94 from 25)

For the most part, you can ignore most suppressed errors, but you might find it useful to enable the verbose flag (-v) for more detailed reports. Please consider reading the documentation for more information on this. Although the documentation examples should be reasonably easy to follow, in order to practice your understanding of what is being reported for various synchronization bugs, you might want to consider intentionally introducing various synchronization errors in the producer-consumer code and understanding the resulting helgrind report. For example, comment out the line in the producer function which acquires the region_mutex. What does the helgrind report tell you? Although in this assignment the helgrind tool is probably sufficient for detecting most mistakes you could run into, it is not a "catch-all" solution for synchronization bugs, and it is constantly being enhanced with more features.
Once again, for the types of errors detected by helgrind, please read the Documentation (http://valgrind.org/docs/manual/hg-manual.html) carefully.

Basic tester
We are also providing a self-tester that checks some basic properties of your code. The tester will check that all cars from a given lane enter the intersection in the right order (cars don't skip over others), by inspecting your output (see the comments in the starter code on what output format is expected). The tester will also check that all cars make it out of the intersection in the right quadrant, according to their out_dir. This is why you should not free out_cars, and you should not modify the traffic code. An example of how to run the student tester:

$ ./student_tester schedule_file1 schedule_file2 ...

The result of your run will be placed in a file called results.txt. If there are no errors in the result file, then your code passes some basic sanity checks. You should use more complex schedules than the one we gave you, and carefully tailored schedules that capture various corner cases. Once again, this tester is just a basic checker and should not preclude you from using valgrind to track down concurrency bugs, as well as thoroughly testing your code!

Submission
You will submit the cars.c file that contains your implementation, along with the files required to build your program (including the provided traffic.h, Makefile, and traffic.c, which you should not modify). Do not submit executables or object files! Additionally, you must submit an INFO.txt file, which contains as the first 3 lines the following: your name, your UtorID(s), and the SVN revision number for your last submission (as a general rule, we will always take the last revision). Your INFO.txt must also contain a paragraph titled "Discussion", which discusses whether starvation can happen when using this monitor. Describe your rationale.
Aside from this, please feel free to describe in a separate paragraph(s) any problems you've encountered, what isn't fully implemented (or doesn't work fully), any special design decisions you've taken (as long as they conform to the assignment specs), etc. Do not add any subdirectories in your A2 directory. The files mentioned above should be the only things you need to add to your submission in your repository.

Marking scheme
We will be marking based on correctness (90%) and coding style (10%). Make sure to write legible code, properly indented, and to include comments where appropriate (excessive comments are just as bad as not providing enough comments). Code structure and clarity will be marked strictly!
Overview
In this assignment, you will achieve the goal of hijacking (intercepting) system calls by writing and installing a very basic kernel module into the Linux kernel. Here is what "hijacking (intercepting) a system call" means. You will implement a new system call named my_syscall, which will allow you to send commands from userspace to intercept another preexisting system call (like read, write, open, etc.). After a system call is intercepted, the intercepted system call will log a message first, before continuing to perform what it was supposed to do. For example, if we call my_syscall with the command REQUEST_SYSCALL_INTERCEPT and target system call number __NR_mkdir (the macro representing the system call mkdir) as parameters, then the mkdir system call will be intercepted; afterwards, when another process calls mkdir, mkdir will log some message (e.g., "muhahaha") first, then perform what it was supposed to do (i.e., make a directory). But wait, that's not the whole story yet. Actually, we don't want mkdir to log a message whenever any process calls it. Instead, we only want mkdir to log a message when a certain set of processes (PIDs) is calling mkdir. In other words, we want to monitor a set of PIDs for the system call mkdir. Therefore, you will need to keep track, for each intercepted system call, of the list of monitored PIDs. Our new system call will support two additional commands to add/remove PIDs to/from the list. When we want to stop hijacking a system call (let's say mkdir, but it can be any of the previously hijacked system calls), we can invoke the interceptor (my_syscall) with a REQUEST_SYSCALL_RELEASE command as an argument and the system call number that we want to release. This will stop intercepting the target system call mkdir, and the behaviour of mkdir should go back to normal, as if nothing happened.
Checklist
Here is a checklist that should help get you started, and make sure that you won't forget the important things:
1. Find your SVN repository on MarkUs (see "Submission"), and do a checkout to make sure you can access it. The starter code files can be found on the teaching lab machines (either the servers or workstations), under /u/csc369h/winter/pub/a1-starter/starter_code.tgz
2. Test that you have access to the VM in the teaching labs (instructions below).
3. Download the disk image for the virtual machine here (gzipped) (http://www.cs.toronto.edu/~bogdan/stuff/UbuntuServer-CSC369-upd.zip). On the host computer (your laptop or a lab computer), use virtual machine software (VirtualBox or VMware) to create a virtual machine using the disk image you downloaded (instructions follow below).
4. Read and understand the existing code in the starter code. This is an important step of this assignment, and you should not start writing your own code before you have a good understanding of the starter code.
5. Implement the new kernel module by completing the source file interceptor.c. Sections that need to be completed are marked with the TODO tag. Do NOT modify the header file interceptor.h.
6. Make sure to test as you go. You should first make sure that the commands to intercept and de-intercept work well, before attempting to implement the monitoring commands.
7. Testing and debugging (must be done in the virtual machine):
   a. Check out your code inside the virtual machine.
   b. Implement the intercept and release commands.
   c. Compile the test_intercept.c program using gcc.
   d. Test your code using sudo ./test_intercept, and make sure that all tests pass.
   e. Implement the monitoring/un-monitoring commands.
   f. Compile the test_full.c program using gcc.
   g. Test your code using sudo ./test_full, and make sure that all tests pass.
8. Submit your code on time. See "Submission" for more details.
9. Congratulations!
You now have some great hands-on experience with the Linux kernel! You can now be proud of having programmed a Linux kernel module. You know what else is commonly implemented as kernel modules? Device drivers! Although they are more complex, you now technically have the basis to try to write one. Isn't that cool?

Goal
The goal of this assignment is to learn more about system calls and to use synchronization mechanisms. For this assignment you will be writing a very basic kernel module that intercepts system calls and monitors processes on demand.

Requirements
In order to be able to issue our own hijacking commands from userspace, we need a new system call that takes as parameters the command, the system call number (to be intercepted), and (for monitoring) a pid. Instead of adding a new system call, which can be tricky, our new system call my_syscall will be installed in place of an unused system call in the system call table. We will connect my_syscall to the entry number MY_CUSTOM_SYSCALL (in effect, entry 0, which is mostly unused). The new system call my_syscall, defined as follows:

int my_syscall(int cmd, int syscall, int pid);

will serve as an interceptor and will receive the following commands from userspace:
a. REQUEST_SYSCALL_INTERCEPT: intercept the system call syscall.
b. REQUEST_SYSCALL_RELEASE: de-intercept the system call syscall.
c. REQUEST_START_MONITORING: start monitoring process pid for system call syscall, i.e., add pid to syscall's list of monitored PIDs. A special case: if pid is 0, then all processes are monitored for syscall, but only root has permission to issue this command (see the comments for my_syscall in the starter code for more details).
d. REQUEST_STOP_MONITORING: stop monitoring process pid for system call syscall, i.e., remove pid from syscall's list of monitored PIDs.
Kernel module operation
Your kernel module must, upon initialization, replace the system call table entry for the MY_CUSTOM_SYSCALL number with the my_syscall function. When the module is released, it must restore this system call to its original routine. As a result, when your kernel module is loaded, any subsequent invocation of the system call number MY_CUSTOM_SYSCALL from userspace can issue four types of commands: to intercept or release a given system call, and to start and stop monitoring a pid for a specific syscall. You must implement the my_syscall function accordingly.
1. REQUEST_SYSCALL_INTERCEPT and REQUEST_SYSCALL_RELEASE: When an intercept command is issued, the corresponding entry in the system call table is replaced with a generic interceptor function (discussed later), and the original system call is saved. When a REQUEST_SYSCALL_RELEASE command is issued, the original saved system call is restored in the system call table in its corresponding position.
2. REQUEST_START_MONITORING and REQUEST_STOP_MONITORING: Monitoring a process consists of the module logging to userspace some information about the process and the system call: the system call number, the parameters of the system call, and the pid of the process. When a REQUEST_START_MONITORING command comes through our custom system call, the kernel module must record internally that the pid passed as a parameter should be monitored for the syscall number (passed as a parameter as well). The monitoring can be done for a specific pid, or for all pids (in which case the pid parameter to my_syscall will be 0).

Ok, but I still don't understand: what does it mean to monitor a pid? And what does the generic interceptor function do? Let's start with the monitoring.
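The save/replace/restore dance for intercept and release can be illustrated in miniature. The Python sketch below is a userspace analogue only — the real module performs these steps in C on sys_call_table (under a spinlock), and all the names here are made up for illustration:

```python
# Miniature, hypothetical analogue of intercepting a system call table entry.
table = {"mkdir": lambda path: "made " + path}   # stand-in for sys_call_table
saved = {}                                       # saved original handlers
log = []                                         # stand-in for the kernel log

def intercept(name):
    saved[name] = table[name]                    # save the original first
    def generic_interceptor(*args):
        log.append((name, args))                 # log the call and its arguments...
        return saved[name](*args)                # ...then fall through to the original
    table[name] = generic_interceptor            # replace the table entry

def release(name):
    table[name] = saved.pop(name)                # restore the saved original
```

The key invariant is the same as in the kernel module: the original handler must be saved before the entry is overwritten, and release must put back exactly what was saved.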
We have established that once the user issues a monitoring command, the kernel module records internally that pid should be monitored whenever it issues system call number syscall (it will be placed in a monitored list – see details in the starter code). We have also established that the generic interceptor function is what each intercepted system call will reach. In other words, whenever we reach the generic interceptor, we know that the system call is being intercepted (otherwise we would not reach it). If the pid of the process issuing the system call is being monitored, we must print some information to a log. The log message will simply contain the system call number and the arguments, as well as the calling process's pid. We have provided you in the starter code with a log_message macro, which takes care of sending a message to the system log. You can check the log using the dmesg command. As you might expect, regardless of whether a pid is monitored or not, the generic interceptor must eventually (once it is done logging, if applicable) call the original system call, to allow normal operation of all processes in the system.

Alright, but what if a process exits before the user can issue a system call to stop monitoring it? Good question! When your kernel module initializes, it should also hijack the exit_group system call (with number __NR_exit_group), by replacing it in the system call table with your own custom function my_exit_group. Of course, make sure to save the original exit_group function, and to restore it when your kernel module is unloaded. Implementing the my_exit_group function should be simple: all you have to do is remove the pid of the exiting process from all of the kernel module's internal bookkeeping on monitored processes, then call the original exit_group function.

Error Conditions
You must make sure to check for any possible misuse of the commands.
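The bookkeeping in my_exit_group amounts to a sweep over every intercepted syscall's monitored list. A minimal Python sketch of just that sweep, assuming a hypothetical mapping from syscall number to a set of monitored pids (the real module walks kernel linked lists under a spinlock instead):

```python
def exit_group_cleanup(pid, monitored):
    """Remove pid from every syscall's monitored list before the saved
    original exit_group runs. `monitored` maps syscall number -> set of
    pids (an illustrative stand-in for the module's per-syscall lists)."""
    for pids in monitored.values():
        pids.discard(pid)   # no-op if pid was not monitored for this syscall
```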
In case of misuse, you should return a proper error code (e.g., -EINVAL, -EPERM; google "Linux error code" for more information on error codes). Here is a list of things you should keep in mind:

A. For each of the commands, check that the arguments are valid (-EINVAL):
- The syscall number must be valid: not negative, not > NR_syscalls (the last syscall number in the table), and not MY_CUSTOM_SYSCALL itself (for obvious reasons).
- The pid must be valid for the monitoring commands. It cannot be a negative integer, and it must be an existing pid (except for the case when it is 0, indicating that we want to start/stop monitoring for all pids). If a pid belongs to a valid process, then the following call is not NULL: pid_task(find_vpid(pid), PIDTYPE_PID)

B. Check that the caller has the right permissions (-EPERM):
- For the first two commands, we must be root (see the current_uid() macro) to be able to intercept or release system calls.
- For the last two commands, the following logic applies: Is the calling process root? If so, all is good, no doubts about permissions. If it is not, then check whether the requested pid is owned by the calling process. Also, if pid is 0 and the calling process is not root, then access is denied (monitoring all pids should only be allowed for the superuser, for obvious reasons).

C. Check for correct context of commands (-EINVAL):
- Cannot de-intercept a system call that has not been intercepted yet.
- Cannot stop monitoring for a pid that is not being monitored, or if the system call has not been intercepted yet.
- If the system call has not been intercepted yet, a command to start monitoring a pid for that syscall is also invalid.

D. Check for -EBUSY conditions:
- Intercepting a system call that is already intercepted.
- Monitoring a pid that is already being monitored.

What if a stop monitoring request comes in for a specific PID (let's call it P), for a syscall that monitors all PIDs?
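The checks A–D form a decision tree that is worth getting straight before writing any kernel code. Here is a hedged Python sketch of that tree: the command constants' values and the boolean flags are assumptions standing in for state the real module derives from its table and from current_uid(), not the starter code's actual interface.

```python
# errno values used by the checks
EINVAL, EPERM, EBUSY = 22, 1, 16

# Command constants: values are illustrative assumptions.
REQUEST_SYSCALL_INTERCEPT, REQUEST_SYSCALL_RELEASE = 0, 1
REQUEST_START_MONITORING, REQUEST_STOP_MONITORING = 2, 3
MY_CUSTOM_SYSCALL, NR_syscalls = 0, 300

def check_args(cmd, syscall, pid, *, caller_is_root, pid_owned_by_caller,
               intercepted, monitored):
    """Decision tree for checks A-D. The keyword flags stand in for
    state the real module keeps internally."""
    monitoring = cmd in (REQUEST_START_MONITORING, REQUEST_STOP_MONITORING)
    # A. argument validity
    if syscall < 0 or syscall > NR_syscalls or syscall == MY_CUSTOM_SYSCALL:
        return -EINVAL
    if monitoring and pid < 0:
        return -EINVAL
    # B. permissions
    if cmd in (REQUEST_SYSCALL_INTERCEPT, REQUEST_SYSCALL_RELEASE):
        if not caller_is_root:
            return -EPERM
    else:
        if pid == 0 and not caller_is_root:
            return -EPERM          # monitoring all pids is root-only
        if pid != 0 and not (caller_is_root or pid_owned_by_caller):
            return -EPERM
    # C. context
    if cmd == REQUEST_SYSCALL_RELEASE and not intercepted:
        return -EINVAL             # nothing to de-intercept
    if monitoring and not intercepted:
        return -EINVAL             # cannot (un)monitor a non-intercepted syscall
    if cmd == REQUEST_STOP_MONITORING and not monitored:
        return -EINVAL
    # D. busy
    if cmd == REQUEST_SYSCALL_INTERCEPT and intercepted:
        return -EBUSY
    if cmd == REQUEST_START_MONITORING and monitored:
        return -EBUSY
    return 0
```

Note the ordering: argument validity first, then permissions, then context, then -EBUSY, which keeps each error condition from masking an earlier one.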
Is that an error, or should we treat it as a special case? For this assignment, you can assume that this is an error and simply return -EINVAL.

General information
1. You must use the starter code provided, which gives you detailed instructions on what you need to implement. Please make sure to implement all the parts indicated using detailed TODO comments. Please make sure to first attend the tutorial, which will help you write a simple kernel module and show you how to use printk statements for debugging. See the tutorial notes as well.
2. Your assignment will be tested on a virtual machine on CDF (aka the teaching labs). You can access this virtual machine from your teaching labs account, or you can download the provided virtual machine disk image and install it on your personal computer through a virtualization solution (free options include VMware Player, VirtualBox, etc.).
3. We strongly recommend that you do NOT use the virtual machine for development, but rather only for testing and debugging. While working on this assignment, it is quite likely you will crash the kernel, and although you can kill and restart the VM, there will be no guarantee that your code will still be there (the VM tools on the teaching labs won't guarantee you safe snapshots). To prevent your hard work from possible data corruption, either do an SVN checkout inside the VM and use your repository to commit your code periodically from within the VM, or make sure to at least back up your code periodically (e.g., by scp-ing it back to your teaching labs account).

Accessing the Virtual Machine on the teaching labs
Guidelines for accessing the VM on the teaching labs can be found here (VirtualMachineInstructions.txt). Please make sure to follow the instructions carefully.

Setup VM On Your Own Machine
Note: Your assignment ultimately has to be tested on a teaching lab machine.
However, if you wish to develop and test it first on your own machine, using virtualization software (*do not test your assignment directly on your computer!*), we provide some basic instructions on how to do so. Since VirtualBox is one of the most portable (as well as free) virtualization solutions, here (VirtualBoxInstructions.html) and here (http://www.cs.toronto.edu/~dbkats/csc369virtualboxinstructions.html) are some basic guidelines on how to install the VM image in VirtualBox on your computer (of course, many tutorials can be found online as well, so feel free to consult other sources if something does not work well on your own machine).

Implementation details
1. Since the number of system calls is rather small (~300), and for performance reasons, you must maintain the system call information in an array. Each array element will contain information as described in the mytable struct:

typedef struct {
    asmlinkage long (*f)(struct pt_regs);
    int intercepted;
    int monitored;
    int listcount;
    struct list_head my_list;
} mytable;

2. You must use a linked list for storing information about the monitored processes; using an array of fixed size is entirely inadequate (the search time would be the same as for a linked list and the implementation complexity would be the same, but the size of the array would limit the number of entries).
3. The system call table is exported as void* sys_call_table[NR_syscalls], present in one of the kernel source files from the VM image on the teaching labs. If you wish to configure your own kernel image and recompile it, you can modify the source code by adding the following two lines to the /usr/src/linux-source-2.6.32/arch/x86/kernel/i386_ksyms_32.c file:

extern void* sys_call_table[];
EXPORT_SYMBOL(sys_call_table);

then recompile the kernel. Again, our virtual machine image already has these changes in place.
4. Since the 2.6 kernel is preemptive, you must protect access to shared data.
You will be using spinlocks for this purpose. The use of spinlocks is fairly simple, and you have been shown some examples in one of the tutorials.
6. Logging the system call will be done using the log_message macro, defined in the interceptor.h header file.
7. For testing, you can use the provided tester programs. After you compile a test program (the provided Makefile only compiles your interceptor module, not any tester!), remember to run the tester using sudo privileges in the VM. To facilitate your testing, you should first try to implement the commands to intercept and release system calls. When you are ready to test these, use the test_intercept.c tester. Once all tests pass, you can proceed to implementing the monitoring commands. To test all commands (both related to intercepting and to monitoring), you can use the test_full.c tester.

Testing your code
To help you test your code, and to encourage you to test as you go, we have provided two testers, which you will also find in your repositories:
- test_intercept.c – tests whether your intercept and de-intercept commands work correctly. You should implement these first and make sure the tester passes all cases.
- test_full.c – tests whether all commands (including intercept, release, and both monitoring commands) work properly. This is a superset of the first tester, and you should only use it once your code passes the first tester.

Other Useful Tips
Again, run tests ONLY in the virtual machine, NOT on your native computer, unless you hate your laptop. Once more, we strongly recommend that you do NOT use the virtual machine for development, but rather only for testing and debugging, since it is quite likely you will crash the kernel and there will be no guarantee that your code will be intact.
To prevent your hard work from possible data corruption, either do an SVN checkout inside the VM and commit your code periodically, or make sure to at least back up your code periodically. Reading and understanding code is as important as (if not more important than) writing code. The comments in the starter code contain a lot of information; make sure to read them carefully. Remember that when we de-intercept a syscall, the original system call must be restored in the system call table. For that, you must properly store the original system call before replacing it. For debugging, learn how to use the printk function, which prints messages to the kernel log. See the tutorial notes as well. Use the dmesg command to check the kernel log.

Submission
You will submit the interceptor.c file that contains your implementation, along with the files required to build your program (including the provided interceptor.h, Makefile, and Kbuild, which you should not modify). Do not submit executables or tester files! For those working in pairs, please make sure to commit to the group repository. If you previously had trouble forming groups on A0, please contact me in advance. Do not leave this to the last minute; technical trouble with your repository will not get you an extension! Additionally, you must submit an INFO.txt file, which contains as the first 3 lines the following: your name, your UtorID(s), and the svn revision number for your last submission. As a general rule, we will always take the last revision; the revision number lets us check that we did not miss a revision when we retrieve your code via MarkUs. Aside from this, please feel free to describe problems you've encountered, what isn't fully implemented (or doesn't work fully), any special design decisions you've taken, etc.

Marking scheme
The data is in the form of a dictionary {'x': features, 'label': label}, where label is the value to be predicted based on the features.

Problem 1 (6 points): Binary Perceptron
Complete the class PerceptronModel in models.py and the function train_perceptron in train.py:
- init(self, dimensions): Initialize PerceptronModel's weight parameters. The weights need to be saved as Parameter() objects with dimension 1 × dimensions.
- forward(self, x): Calculates the dot product of the stored weight vector and the given input, returning a tensor object.
- get_prediction(self, x): Returns 1 if the dot product is non-negative, otherwise returns -1.
- train_perceptron: Loop through the dataset, update on misclassified samples, and stop training once a full traversal produces no errors.
Run python autograder.py -q q1 to test, optionally with the -nographics option to run without graphics. A correct solution should complete the autograder within 30 seconds when running without graphics.

Neural Network Tips
Subsequently, we will implement models such as nonlinear regression, handwritten digit classification, language identification, and handwritten digit classification with convolutional neural networks.
- Create trainable parameters: define the weight matrix M and bias vector B; the corresponding code is m = Tensor(2, 1) and b = Tensor(1, 1).
- Calculate predicted values: define a linear layer to calculate predictions: predicted_y = self.Linear_Layer(x).
- Calculate loss: use the mean squared error loss function: loss = mse_loss(predicted_y, y).
- Train the network: initialize the optimizer with optim.Adam(self.parameters(), lr=lr). During training, each iteration requires resetting the gradients, calculating the predictions, calculating the loss, calculating the gradients, and updating the weights.
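The perceptron loop described in Problem 1 can be sketched in a few lines of plain Python. This is a hedged illustration of the update rule only — the course's actual PerceptronModel must use the provided Parameter/Tensor classes, and the function below is a made-up stand-in, not the autograder's interface:

```python
def train_perceptron(data, dims):
    """Train a binary perceptron: data is a list of (x, label) pairs with
    label in {+1, -1}. Stops after one full error-free pass."""
    w = [0.0] * dims
    while True:
        errors = 0
        for x, label in data:
            dot = sum(wi * xi for wi, xi in zip(w, x))
            pred = 1 if dot >= 0 else -1          # get_prediction rule
            if pred != label:
                # misclassified sample: w <- w + label * x
                w = [wi + label * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:                           # clean traversal: stop training
            return w
```

Note that this loop terminates only if the data is linearly separable, which the perceptron convergence theorem guarantees for the course's q1 dataset.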
Problem 2 (6 points): Nonlinear regression
Complete the class RegressionModel in models.py, the function train_regression in train.py, and the function regression_loss in losses.py:
- RegressionModel.init: Complete the necessary initialization.
- RegressionModel.forward: Returns a node of size batch_size × 1, representing the model's predictions.
- regression_loss: Computes the loss for a given predicted output and target output.
- train_regression: Train the model with gradient-based updates.
Run python autograder.py -q q2 to test. A full score is achieved if the model's average loss reaches 0.02 or less.

Question 3 (6 points): Digit classification
Complete the class DigitClassificationModel in models.py, the function train_digitclassifier in train.py, and the function digitclassifier_loss in losses.py:
- DigitClassificationModel.forward(): Returns a node of size batch_size × 10, where a higher score indicates a higher probability that the digit belongs to a specific class.
- Use cross_entropy as the loss function. Do not use the ReLU activation function on the last linear layer of the network.
The model must achieve at least 97% accuracy on the test set to score the points. Run python autograder.py -q q3 to test. You can use the validation accuracy computed by dataset.get_validation_accuracy() to help determine when to stop training.

Question 4 (7 points): Language Identification
Complete the class LanguageIDModel in models.py, the function train_langaugeid in train.py, and the function languageid_loss in losses.py.

Question 6 (2 points): Attention Mechanism
Implement in models.py a working class AttentionBlock that implements the "Scaled Dot-Product Attention" formula softmax(Q K^T / sqrt(d_k)) V, which involves operations such as matrix multiplication and transposition, and also requires applying a causal mask.
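To make the attention formula concrete, here is a minimal sketch of scaled dot-product attention on plain nested lists (no numpy). It is illustrative only — the actual AttentionBlock must be written against the course's tensor API, and the mask convention below (1 = may attend, 0 = masked out) is our assumption:

```python
import math

def attention(Q, K, V, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V on nested lists."""
    d_k = len(K[0])
    # scores[i][j] = (q_i . k_j) / sqrt(d_k)
    scores = [[sum(qi * kj for qi, kj in zip(q, k)) / math.sqrt(d_k) for k in K]
              for q in Q]
    if mask is not None:
        # causal mask: set disallowed (future) positions to -inf before softmax
        scores = [[s if m else float("-inf") for s, m in zip(row, mrow)]
                  for row, mrow in zip(scores, mask)]
    out = []
    for row in scores:
        mx = max(row)                              # subtract max for stability
        exps = [math.exp(s - mx) for s in row]
        z = sum(exps)
        w = [e / z for e in exps]                  # softmax over the keys
        # weighted sum of the value rows
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out
```

Masking with -inf before the softmax is the standard trick: exp(-inf) is 0, so masked positions receive exactly zero attention weight.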
Question 7 (0 points but interesting): Character-GPT
Build a small generative model, trained on Shakespeare's play text, that generates the next character. You'll need to edit the code in gpt_model.py to implement the forward functions of the Transformer_Block class and of the GPT class. Once the model is trained by running python chargpt.py, you can experiment with adjusting the network size, changing the training text, and so on.

Submit
Upload the edited Python files to Gradescope. If collaborating with others, only one person should submit and tag the other, to avoid duplicate submissions that could be mistakenly identified as academic misconduct. The staff reference solution takes about 12 minutes to run the full project autograder. If your code takes too long to run, check its efficiency.

CS188 Project 5 Machine Learning
Markov Decision Processes

1 Assignment Weight
Read everything below carefully, as this assignment has changed term-over-term.

2 Objective
In some sense, we have spent the semester thinking about machine learning techniques for various forms of function approximation. It's now time to think about using what we've learned in order to allow an agent of some kind to act in the world more directly. This assignment asks you to consider the application of some of the techniques we've learned from reinforcement learning to make decisions. Requirements will likely have changed from previous terms.

3 Procedure
3.1 The Problems Given to You
You are being asked to explore Markov Decision Processes (MDPs):
1. Come up with two interesting MDPs. Explain why they are interesting. They don't need to be overly complicated or directly grounded in a real situation, but it will be worthwhile if your MDPs are inspired by some process you are interested in or are familiar with. It's ok to keep it somewhat simple. For the purposes of this assignment, though, make sure one MDP has a "small" number of states, and the other MDP has a "large" number of states. The judgement and rationalization of what is "small" and "large" will be up to you. For initial intuition, 200 states is not considered "large". Additionally, neither of the MDPs you choose should be a grid world problem.
2. Solve each MDP using value iteration as well as policy iteration. How many iterations does it take to converge? Which one converges faster? Why? How did you choose to define convergence? Do they converge to the same answer? How did the number of states affect things, if at all?
3. Now pick your favorite reinforcement learning algorithm and use it to solve the two MDPs. How does it perform, especially in comparison to the cases above where you knew the model, rewards, and so on? What exploration strategies did you choose? Did some work better than others?
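The convergence question in step 2 is easiest to reason about with a concrete loop in front of you. Below is a minimal value iteration sketch on a tiny made-up 2-state MDP; the transition/reward numbers are arbitrary illustrations, and a convergence threshold on the max change in V is just one possible definition of convergence (part of what step 2 asks you to justify):

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """P[s][a] = list of (prob, next_state); R[s][a] = immediate reward.
    Returns the optimal state values and the number of sweeps used."""
    n = len(P)
    V = [0.0] * n
    iters = 0
    while True:
        iters += 1
        # Bellman optimality backup for every state
        newV = [max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in range(len(P[s])))
                for s in range(n)]
        # one possible convergence test: max change below tol
        if max(abs(a - b) for a, b in zip(newV, V)) < tol:
            return newV, iters
        V = newV

# Hypothetical 2-state, 2-action MDP: action 0 goes to state 0, action 1
# to state 1; staying in state 1 pays 2 per step.
P = [[[(1.0, 0)], [(1.0, 1)]],
     [[(1.0, 0)], [(1.0, 1)]]]
R = [[0.0, 1.0],
     [0.0, 2.0]]
```

For this toy model the fixed point can be checked by hand: V(1) = 2 + 0.9·V(1) gives V(1) = 20, and V(0) = 1 + 0.9·20 = 19, which is a useful sanity check on any library you adopt from Section 3.2.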
Extra Credit Opportunity: The analysis writeup is limited to 8 pages. The page limit does include your citations. Anything past 8 pages will not be read. Please keep your analysis as concise as possible while still covering the requirements of the assignment. As a final check during your submission process, download the submission to double-check everything looks correct on Canvas. Try not to wait until the last minute to submit, as you will only be tempting Murphy's Law.
3.2 Acceptable Libraries
The algorithms used in this assignment are relatively easy to implement. Existing implementations are easy to find too. Below are Java and Python examples.
• bettermdptools (Python) https://github.com/jlm429/bettermdptools
• BURLAP (Java) http://burlap.cs.brown.edu/
4 Submission Details
You must submit:
• A file named README.txt containing instructions for running your code. We need to be able to get to your code and your data. Providing entire libraries isn't necessary when a URL would suffice; however, you should at least provide any files you found necessary to change and enough support and explanation so we can reproduce your results on a standard Linux machine.
• A file named yourgtaccount-analysis.pdf containing your writeup (GT account is what you log in with, not your all-digits ID). This file should not exceed 8 pages.
The file yourgtaccount-analysis.pdf should contain:
• A description of your MDPs and why they are interesting.
• A discussion of your experiments. It might be difficult to generate the same kinds of graphs for this part of the assignment as you did in previous assignments; however, you should come up with some clever way to describe the kinds of results you produce. If you can achieve this visually, all the better. However, a note of caution: figures should remain legible at 100% zoom. Do not squish figures together such that axis labels become 8pt font or smaller.
We are looking for a clear and concise demonstration of knowledge and synthesis of results. Any paper that solely has figures without formal writing will not be graded. Be methodical with your space. Note: we need to be able to get to your code and your data. Providing entire libraries isn't necessary when a URL would suffice; however, you should at least provide any files you found necessary to change and enough support and explanation so we can reproduce your results on a standard Linux machine.
5 Feedback Requests
When your assignment is scored, you will receive feedback explaining your errors and successes in some level of detail. This feedback is for your benefit, both on this assignment and for future assignments. It is considered a part of your learning goal to internalize this feedback. We strive to give meaningful feedback with human interaction at scale. We have a multitude of mechanisms behind the scenes to ensure grading consistency with meaningful feedback. This can be difficult, however, and sometimes feedback isn't as clear as you need. If you are confused by a piece of feedback, please start a private thread on Ed and we will jump in to help clarify. The phrase "as long as you participate in this journey of exploring, tuning, and analyzing" is key. We take this very seriously and you should too. Do not copy from the assignments of other students (even across sections and previous courses) or from website repositories.
What does it mean to be original? It is well known that for this course we do not care about code. We are not interested in your working out the edge cases in k-NN, or proving your skills with Python. While there is some value in implementing algorithms yourselves in general, here we are interested in your grokking the practice of ML itself. That practice is about the interaction of algorithms with data.
As such, the vast majority of what you're going to learn in order to master the empirical practice of ML flows from doing your own analysis of the data, hyperparameters, and so on; hence, you are allowed to steal ML code from libraries but are not allowed to steal code written explicitly for this course, particularly those parts of code that automate exploration. You will be tempted to just run said code that has already been overfit to the specific datasets used by that code, and will therefore learn very little. How to cite: your README file will include pointers to any code and libraries you used. If we catch you…
7 Version Control
References
Note: Users running Windows should replace the colon (:) with a semicolon (;) in the classpath argument for all commands listed below.
Getting started
Run the following commands to build your compiler, and then run all the provided tests:

```
mvn clean package
java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --pass=..s --run --dir src/test/data/pa3/sample/ --test
```

In the starter code, only one test should pass. Your objective is to implement a code generator that passes all the provided tests and meets the assignment specifications.
Generating assembly files
You can also run the code generator on one input file at a time. In general, running the code generator on a ChocoPy program is a three-step process.
1. First, run the reference parser to get an AST JSON:
   java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --pass=r --out
2. Second, run the reference analysis on the AST JSON to get a typed AST JSON:
   java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --pass=.r --out
3. Third, run your code generator on the typed AST JSON to get a RISC-V assembly file:
   java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --pass=..s --out
The src/tests/data/pa3/sample directory already contains the typed AST JSONs for the test programs (with extension .out.typed); therefore, you can skip the first two steps for the sample test programs.
Executing an assembly program using the Venus simulator
To run a generated RISC-V program in the Venus-164 execution environment, run:
   java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --run
Chained commands
For quick development, you can chain all the stages to directly execute a ChocoPy program:
   java -cp "chocopy-ref.jar:target/assignment.jar" chocopy.ChocoPy --pass=rrs --run
You can omit the --run in the above chain to print the generated assembly program instead of executing it.
Running the reference implementation
To observe the output of the reference implementation of the code generator, replace --pass=rrs with --pass=rrr in any command where applicable.
Assignment specifications
See the [PA3 specification][] on the course website for a detailed specification of the assignment. Refer to the [ChocoPy Specification][] on the CS164 web site for the specification of the ChocoPy language. Refer to the [ChocoPy Implementation Guide][] on the CS164 web site for the conventions used by the reference compiler and the starter code.
Receiving updates to this repository
Add the upstream repository remote (you only need to do this once in your local clone):
   git remote add upstream https://github.com/cs164fa2024/pa3-chocopy-code-generation.git
Submission writeup
Team member 1:
Team member 2:
(Students should edit this section with their write-up)
Introduction
In this project, you will implement a 2-player game in MIPS assembly: Connect 4, aka Four-in-line. The game consists of a board representing the play area. Two players face each other and drop tokens, one at a time, until one of them manages to place four in a line! Start early!
Game mechanic
Player 2, it's your turn. Select a column to play. Must be between 0 and 6
|_|_|_|_|_|_|_|
|_|_|_|_|_|_|_|
|_|+|+|+|+|_|_|
|_|_|_|_|_|_|_|
|_|_|_|_|_|_|_|
Congratulations player 2. You won!
 0 1 2 3 4 5 6
|_|_|_|_|_|_|_|
|_|+|_|_|_|_|_|
|_|+|_|_|_|_|_|
|_|+|_|_|_|_|_|
|_|+|_|_|_|_|_|
|_|_|_|_|_|_|_|
Congratulations player 2. You won!
Your assignment
Plan
1. Think of which functions you will need to implement, and what they will do.
   1) Start from the main function and split your program into multiple steps.
2. Think of possible invalid user inputs, and how they will impact the program negatively.
   1) Board bounds.
   2) Filling a column to the top.
Implement
Implement the MIPS assembly code that executes the game described above. Your program will manage all interactions with the user and the board:
1. It begins by displaying a welcome message and an explanation of what the user should do. How is the game played?
2. Print the empty board.
3. Then, the game begins, and your program will:
   1) Ask player 1 to play:
      ▪ Ask for and validate user input (MARS will crash if the user gives no input or a letter; this is fine!)
      ▪ Don't allow the user to select a non-existing tile.
      ▪ Don't allow the user to select a full column.
      ▪ "Drop" the token into the board at the requested column.
      ▪ Check for a winning condition.
   2) Ask player 2 to play:
      ▪ Ask for and validate user input (MARS will crash if the user gives no input or a letter; this is fine!)
      ▪ Don't allow the user to select a non-existing tile!
      ▪ Don't allow the user to select a full column!
      ▪ "Drop" the token into the board at the requested column.
      ▪ Check for a winning condition.
   3) Repeat until one of the players wins or the board is full.
4. In the end, print a message letting the winning player know the game has ended.
The welcome message
Bear in mind that you can do your own thing, as long as it fits the project! So use the welcome message to explain to the user exactly how they should play the game. Explain the rules, and how the player can score points.
User input
Your program needs to ask the user in which column he/she wants to drop a token. If the user inputs an invalid value, you inform the user of that and ask again. You must validate the user input! The exact way you implement this is up to you. You must ask the user to input something to select the column.
Representing the board
Feel free to implement all the data structures that you need. However, it is suggested you use matrices. You can implement your board as a matrix of words to keep the status of the game. Here is one suggestion:

board: .word 0, 0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0,
             1, 0, 0, 0, 0, 0, 0,
             1, 1, 0, 0, 0, 0, 0,
             2, 1, 1, 0, 0, 0, 0,
             2, 2, 2, 1, 2, 0, 0

To print a tile, the logic (in pseudocode) is:

   if(board[i][j] == 0) { print('_') } else if(board[i][j] == 1) { print('*') } else { print('+') }

Board size
You can do one of two things:
Easy route: The board should be a 7x6 matrix.
Configurable route: You can make it configurable if you so wish, and then adjust for difficulty. AT THE TOP OF THE FILE! The simplest way is to name a number, like a #define in C. In MARS you can do that like this:

.eqv BOARD_SIZE 42 # 7*6
.eqv BOARD_WIDTH 7
.eqv BOARD_HEIGHT 6

Then you can use the name instead of a number, i.e.
in instructions that would normally use a number (the code is nonsense, don't use it):

   lw $t0, 3($t1)        ->  lw $t0, N_PLAYS($t1)
   li $t0, 3             ->  li $t0, N_PLAYS
   beq $t0, 3, _label    ->  beq $t0, N_PLAYS, _label
   .word N_PLAYS
   arr: .word 0:100      ->  arr: .word 0:BOARD_ELEMENTS

Or ask the user: create variables, and ask what size of board they want.
Printing the board
The board must be shown to the user. Check the example below if you are not sure how to proceed. The only requirement here is that empty tiles must be empty, and players should have different symbols to represent their tokens. The following examples use an _ to represent empty tiles, * to represent "Player 1" tokens, and + to represent "Player 2" tokens.
Ending the game
The game should end when one of the players successfully drops 4 tokens in a line. At that point, let the user know that the game has ended and who won. For example:

Select a column to play. Must be between 0 and 6
3
 0 1 2 3 4 5 6
|_|_|_|_|_|_|_|
|_|_|_|_|_|_|_|
|_|_|_|_|_|_|_|
|_|_|_|+|_|_|_|
|_|_|_|+|_|*|_|
|_|_|+|+|+|*|_|
Congratulations player 2. You won.
-- program is finished running --

Project Stages
Stage 1 – Create the main loop logic and user interaction
The tedious part of this program will be to create all the strings, display them to the user, and get user input. It is also the simpler bit. During the first stage, it is suggested you focus on creating an application that prints the strings to the user, implements the main loop, and asks the user for input (don't forget to make sure the input column is valid!). At this stage you don't have to worry about saving the input, etc. If you finish early, move on to stage 2 and try to display the board. You can edit the matrix manually to "simulate" that some plays have occurred.
Stage 2 – The board
If you finish early, move on to stage 3 and try to find winning game conditions for horizontal and vertical 4-in-line.
Stage 3 – Winning the game
Since you know where the token was dropped, the easiest way (probably not the smartest) is to check the matrix entries around the dropped token. For example, if this was the last dropped token:

 0 1 2 3 4 5 6
|_|_|_|_|_|_|_|
|_|_|_|?|_|_|_|
|_|_|_|_|*|_|_|
|_|_|_|_|_|_|?|
|_|_|_|_|_|_|_|

Helpful Tidbits
Starting the code
This is a very simple program in a higher-level language! But it is much more complex in assembly. As such, here is some advice for developing your program. Plan, and start by writing high-level comments on how you plan to approach the problem:
• If you are not sure what to write, start with the items in the "Your assignment" section of the project 🙂
• Then add detail to those comments.
• If you need to, write the program in a high-level language, draw a diagram, write pseudocode, and then translate that into MIPS assembly.
Testing
DO NOT TRY TO WRITE THE WHOLE PROGRAM BEFORE TESTING IT!
• It's the easiest way to get overwhelmed and confused without knowing what to do!
• Implement small parts of the code and test them!
Split your code into functions
Use functions! They will help you manage the cognitive load. Here is a starting point!

   main:
       jal print_welcome
       jal display_board
   _main_loop:
       ...
   _main_player1:
       ...
       j _main_loop

Submission
Submit a single ZIP file with your project named studentID_MidtermProj.zip (e.g., 2023141520000_MidtermProj.zip). In the zip file, there should be NO folder, just the following files:
• Your connect.asm file. (Put your name and student ID at the top of the file in the comments!)
• A readme.txt file (DO NOT SUBMIT A README.DOCX/README.PDF. SUBMIT A PLAIN TEXT FILE. PLEASE.)
which should contain: a) your name, b) your student ID, c) anything that does not work, and d) anything else you think might help the grader grade your project more easily. Submit on Blackboard. Let me know immediately if there are any problems submitting your work.
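As the Helpful Tidbits for the Connect 4 project suggest, it helps to prototype the trickiest logic in a high-level language before translating it to MIPS. Here is a hypothetical Python sketch (not part of the deliverable) of the Stage 3 win check, scanning the four line directions through the last dropped token:

```python
def wins(board, row, col):
    """Return True if the token at (row, col) completes 4 in a line.
    board: list of lists; 0 = empty, 1/2 = player tokens.
    A prototype of the logic to later translate into MIPS assembly."""
    player = board[row][col]
    H, W = len(board), len(board[0])
    # four line directions: horizontal, vertical, and the two diagonals
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):            # walk both ways from the dropped token
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < H and 0 <= c < W and board[r][c] == player:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False

b = [[0] * 7 for _ in range(6)]
for c in range(1, 5):                    # four '+' tokens on the bottom row
    b[5][c] = 2
print(wins(b, 5, 4))                     # the last drop at column 4 wins
```

Each direction pair here becomes one small loop in assembly, which is why counting outward from the dropped token is easier than rescanning the whole board.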
The goal is to create a budget planning and management system in a single Java NetBeans project. It should have a proper class structure and various functionalities, with all classes encapsulated. There must be a main class to run the application.
Base Class: The Transaction class represents a transaction. It has two subclasses, Expense and Income.
Unique Identifier: Each transaction gets a unique ID using java.util.UUID.
Effective Amount: The Transaction class has a getEffectiveAmount() method. In the Income subclass, the effective amount is the same as the transaction amount. In the Expense subclass, different payment methods (cash, card, Alipay) have different fees (0%, 1%, 0.5% respectively), which are factored into the effective amount.
Category Management: Manages expenses for a specific category like food or electricity bills.
Attributes: Each category has a name, a fixed monthly limit, and a current expenditure (initially 0).
Methods: There are methods to add expenses, check if the current expenditure exceeds the monthly limit, and reset the current expenditure to zero.
Budget Management: Responsible for managing an entity's budget. It stores all expenses and income.
Data Storage: Choose an appropriate Java collection to store expenses and income, and document your choice in the code.
Transaction Handling: Can add transactions. Adding an expense requires specifying the budget category. If adding an expense exceeds the category's monthly limit, throw a MonthlyLimitExceededException (which you need to create).
Category Operations: Can add new budget categories (with no initial expenses) and delete existing ones.
Command-Line Interface: Add a text-based command-line user interface. Use XipuAI to learn how to code it.
Functionality: The interface should support budget management based on the BudgetManager class. You can add extra features like calculating net earnings.
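The getEffectiveAmount() rule above is easy to prototype before writing the Java classes. The assignment itself must be delivered in Java; this Python sketch (hypothetical names, not the required API) only illustrates the fee arithmetic that the Expense and Income subclasses must implement:

```python
# Sketch of the fee rule: card adds a 1% fee, Alipay 0.5%, cash none, so an
# expense's effective amount is amount * (1 + fee_rate); income is unchanged.
FEES = {'cash': 0.0, 'card': 0.01, 'alipay': 0.005}

def effective_amount(amount, kind, payment_method='cash'):
    if kind == 'income':
        return amount                                 # income: effective == nominal
    return amount * (1 + FEES[payment_method])        # expense: fee factored in

print(round(effective_amount(100.0, 'expense', 'card'), 2))   # 101.0
```

In the Java design, the same dispatch falls out naturally from overriding getEffectiveAmount() in each subclass instead of branching on a `kind` flag.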
Best Practices: Follow best practices for command-line interfaces, including exception handling and input validation. The interface should be user-friendly and easy to navigate.
Code Standards: Keep the code neat, with proper indentation. Use suitable variable and method names following Java naming conventions.
Comments: Add comments as needed and split code into appropriate methods. Write Javadoc comments for the BudgetManager class API and submit the generated BudgetManager.html file.
Explanation: Explain which OOP features (like encapsulation) and principles (like the single-responsibility principle) you incorporated in your design. Define the features/principles and illustrate with code snippets if necessary. This section should be 1.5-2 pages long.
Testing Approach: Describe how you tested different parts of the program. You can use JUnit (if familiar) or manually test in the main class.
Functionality Tested: State what functionalities you tested and how, considering corner cases. You can include sample code. This section should be 1.5-2 pages long (excluding screenshots/code).
User Interface Explanation: Explain how the user interface works, including its functionalities, navigation, exception handling, and input tolerance mechanisms.
3. Submission Instructions
Submit the following files in the "Coursework 3 submission" Assignment activity on the Learning Mall Online:
- A ZIP archive of the entire NetBeans project named "CPT206 CW3 Project studentId.zip".
- The BudgetManager.html Javadoc API documentation file.
- The report as a PDF named "CPT206 CW3 Report studentId.pdf", including the XipuAI conversation appendix.
2. Questions 1-4 are for Part I and the rest are for Part II.
4. Weighting is 50%.
5. Show all code for Questions 5 and 6.
6. Write answers in the boxes provided.
7. Answers should be legible.
Figure 1: Two CCS Processes
P ≜ a.P1 + b.c.P2 (1)
P1 ≜ …
P2 ≜ b.P
Q ≜ … (2)
Q1 ≜ a.Q
Answer box for Q1
Answer box for Q1 (continued)
Figure 2: An example Kripke model
(b) Identify valid CTL formulae from the following, with suitable justification. Convert the valid formulae you identified into their equivalent forms using only an adequate set of temporal operators. Here, AP = {p1, p2, p3}.
• AG(p1 ⇒ AX(p1 ∧ p2))
• AF(p1 ⇒ FG(p1 U p2))
• AF(p1 ⇒ AX p2)
• ((AF p ⇒ EG q) U EF p)
Answer box for Q2
Answer box for Q2 (continued)
Figure 3: Two automata
(a) Let Σ = {a, b, c, d} and let A1, A2 be the two automata shown in Figure 3. Answer the following questions:
• bbabab∗ is accepted by A1. Is this true or false? Justify.
• b∗d∗a∗b∗ is accepted by A2. Is this true or false? Justify.
• bbbb∗a∗d∗c∗ is accepted by A1TA2. Is this true or false? Justify.
• aabc∗d∗b ∈ Σ∗. Is this true or false? Justify.
(b) In a timed automaton, which of the following are valid clock constraints, and why? Assume x, y, z are clocks.
i. z >= 12 ∧ x == 20
ii. x …
iii. x − y = 1000
iv. z == x + 5
Answer box for Q3
Answer box for Q3 (continued)
(a) We studied some approaches for using run-time verification to secure pacemakers. Using a suitable diagram, explain how such a monitor works without being able to access the pacemaker directly.
(b) Provide an example property, as a timed automaton, that can be used by a run-time verification monitor to determine whether a pacemaker has been hacked, using an ECG sensor.
Answer box for Q4
Answer box for Q4 (continued)

   def swap1(a, b):
       tmp = b
       b = a
       a = tmp
       return a, b

   def swap2(a, b):
       a = a + b
       b = a - b
       a = a - b
       return a, b

Answer box for Q5
Answer box for Q5 (continued)
(a) You should need no more than 4 variables to model the traces and property.
(b) This problem is called symbolic bounded model-checking.
Figure 4: Two Finite State Machines (FSMs) running asynchronously on a single CPU. Integer-type variables S1 and S2 capture the state value of the FSMs.
Answer box for Q6
Answer box for Q6 (continued)
Example Z3 API functions and usage

   from z3 import IntSort, Solver, sat, DeclareSort, Consts, ForAll
   from z3 import Function, Or, And, RealSort, Implies, Exists
   # Defining a new sort (type)
   T = DeclareSort('T')
   # Defining a binary function that works with type T and returns int type
   f = Function('f', T, T, IntSort())
   # New object of type Solver
   s = Solver()
   # Defining two variables of type T
   a, b = Consts('a b', T)
   # Adding a forall example constraint to the solver
   s.add(ForAll([a, b], a >= b))
   # Example of using Exists
   s.add(Exists([a, b], a < b))
   # Example of using Implies (a > 0 => b < 10)
   s.add(Implies(a > 0, b < 10))
   # Example of using Implies inside another operator
   s.add(ForAll([a, b], Implies(a > 10, b < 100)))
   # Defining variables of type Int
   x, y = Consts('x y', IntSort())
   # Adding an example constraint on variables x and y
   s.add(And(Or(x + y > 10, x - y < 100), x * y == 90))
   # Defining variables of type Real (float)
   j, k = Consts('j k', RealSort())
   # Adding an example constraint on variables j and k
   s.add(And(Or(j + k > 10.90, j - k < 10/80), j * k == 90.3))
   # Checking if a solution exists
   ret = s.check()
   if ret == sat:
       print(s.model())  # Print the result of check if sat

Scratch Pad
Scratch Pad
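The adequate-set conversion asked for in Q2 relies on standard CTL dualities. As a general reminder (these are textbook identities, not the answer to the question), the universal-path operators can be rewritten using the adequate set {EX, EG, EU}:

```latex
\begin{align*}
\mathrm{AX}\,\varphi &\equiv \neg \mathrm{EX}\,\neg\varphi \\
\mathrm{AG}\,\varphi &\equiv \neg \mathrm{EF}\,\neg\varphi
                      \equiv \neg\, \mathrm{E}[\top \,\mathrm{U}\, \neg\varphi] \\
\mathrm{AF}\,\varphi &\equiv \neg \mathrm{EG}\,\neg\varphi \\
\mathrm{A}[\varphi \,\mathrm{U}\, \psi] &\equiv
  \neg\bigl(\mathrm{E}[\neg\psi \,\mathrm{U}\, (\neg\varphi \wedge \neg\psi)]
  \vee \mathrm{EG}\,\neg\psi\bigr)
\end{align*}
```

Applying these, a formula such as AG(p1 ⇒ AX(p1 ∧ p2)) becomes ¬E[⊤ U ¬(p1 ⇒ ¬EX¬(p1 ∧ p2))].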
Problem Overview
The Amazon rainforest is the largest tropical rainforest on Earth, renowned for its unparalleled biodiversity and its key role in regulating global climate patterns [1,2]. Covering over 5.5 million square kilometers, it acts as a vital carbon sink by absorbing significant amounts of atmospheric CO₂, thereby mitigating global climate change [2]. However, in recent decades, temperatures in the Amazon basin have shown a worrying upward trend, largely attributed to climate change [3]. Within this vast rainforest (shown in green in Figure 1), temperatures are rising particularly rapidly in certain areas, leading to more frequent and intense "heat events" [4]. These heat events threaten local ecosystems by increasing the risk of forest fires and have broader implications for global climate stability. The occurrence and intensity of heat events in the Amazon are influenced by large-scale climate drivers that regulate global weather patterns [5]. In particular, four important ocean climate patterns, the El Niño-Southern Oscillation (ENSO), the Tropical South Atlantic (TSA), the Tropical North Atlantic (TNA), and the North Atlantic Oscillation (NAO), play a key role in determining temperature variability in the Amazon. These climate patterns occur in different ocean regions surrounding the Amazon, as shown in Figure 1.
1. Data provided
The temperature time series data and the values of the climate pattern indices (ENSO, NAO, TSA, TNA) are given in Table 1 (Amazon temperature student.csv).
Table 1. Index: what it measures; range; interpretation.
• ENSO: Sea surface temperature anomalies in the Niño 3.4 region; -3 to 3°C; + indicates El Niño, − indicates La Niña.
• NAO: Normalized sea level pressure difference (Azores-Iceland); -4 to 4; + means stronger westerly winds and milder winters, − means the opposite.
• TSA: Sea surface temperature anomalies in the tropical South Atlantic; -1 to 1°C; + indicates warmer waters in the South Atlantic.
• TNA: Sea surface temperature anomalies in the tropical North Atlantic; -1 to 1°C; + indicates warmer North Atlantic waters.
Temperature thresholds for each month of each year are given in threshold.csv.
2. Goals
Your goal is to build neural network models for two different tasks:
Task A (Classification): Predict the occurrence of heat events.
Task B (Regression): Predict temperature.
4. Task A: Classification (Heat Event Detection)
Data Preparation
(a) Define a binary variable Hot: if the temperature of a month exceeds the given threshold for that particular month, then the month is classified as Hot.
   (i) If the monthly temperature exceeds the monthly threshold, then Hot = 1.
   (ii) Otherwise, Hot = 0.
(b) Create a bar graph summarizing the number of hot months per year.
Model development
(c) Randomly divide the dataset into training, validation, and test sets.
(d) Preprocessing: Apply any necessary transformations to the training set, then apply the same transformations to the validation set. Keep a record of all applied transformations.
(e) Build a neural network classifier to predict heat events: define the architecture and hyperparameters (loss function, optimizer, batch size, learning rate, number of epochs). It is recommended that the total number of trainable parameters satisfy: [N_{params }
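Steps (a) and (b) of the Task A data preparation can be sketched as follows. The data layout here is hypothetical; the real inputs come from Amazon temperature student.csv and threshold.csv:

```python
# Sketch of Hot labelling and per-year counts (hypothetical in-memory layout):
# temps and thresholds are dicts keyed by (year, month) -> temperature in °C.
def label_hot(temps, thresholds):
    # Hot = 1 when the month's temperature exceeds that month's threshold
    return {ym: int(temps[ym] > thresholds[ym]) for ym in temps}

def hot_months_per_year(hot):
    # counts per year, ready to plot as the required bar graph
    counts = {}
    for (year, month), h in hot.items():
        counts[year] = counts.get(year, 0) + h
    return counts

temps = {(2000, 1): 26.4, (2000, 2): 27.1, (2001, 1): 25.0}
thresholds = {(2000, 1): 26.0, (2000, 2): 27.5, (2001, 1): 24.5}
hot = label_hot(temps, thresholds)
print(hot_months_per_year(hot))   # {2000: 1, 2001: 1}
```

In practice you would read both CSVs (e.g. with pandas), align them on year and month, and pass the per-year counts to a bar-chart call.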
Project Conditions
• Do not provide or show your assignment work to any other person, other than the teaching staff of COMP4336/9336.
• Do not publish your assignment solution (data/code) via the Internet. For example, do not place your assignment in a public GitHub repository. Sharing, publishing, or distributing your assignment work even after the completion of COMP4336/9336 is not permitted.
Project Topic: Data-Driven WiFi Enhancement
Objective
This project is designed to deepen your understanding of the factors that influence retransmissions in WiFi networks, with a focus on how these retransmissions impact network performance. Through this project, you will gain hands-on experience in data collection, performance analysis, and applying basic machine learning techniques to real-world network data.
Learning Outcomes:
• Practical Data Collection: You will learn how to effectively capture and analyse real-world WiFi traffic data, identifying key patterns related to retransmissions.
• Performance Insight: By analysing the impact of retransmissions on network performance, you will develop a deeper understanding of the challenges in WiFi network management.
• Machine Learning Application: You will apply machine learning to predict retransmission events, gaining insight into the predictive power of various network factors.
• Critical Thinking and Innovation: The project will culminate in designing protocol enhancements that address the identified issues, fostering your ability to innovate within existing network frameworks.
• Real-World Application: The skills and knowledge gained from this project will be directly applicable to solving real-world networking challenges, preparing you for careers in network engineering, cybersecurity, or research and development.
Fig 1. Project Tasks
Project Tasks and Components
1. Data Collection: You will use Wireshark or other protocol analysers to collect WiFi traffic data from operational WiFi networks.
Your goal will be to gather a rich dataset that includes both normal and retransmitted packets. Pay attention to the following:
• Time and Location Diversity: You should capture data from at least two crowded locations where you can observe significant WiFi retransmissions. Note that in light-traffic scenarios there would not be many WiFi collisions and retransmissions, making data collection not particularly useful for your project. Therefore, it is crucial to choose locations with high network activity and a high density of WiFi Access Points (APs), such as libraries, UNSW CSE buildings, or similar busy areas. For a given location, collect data across different times of the day and over multiple days spanning different weeks. You have the flexibility to choose the locations and times for your data collection.
• Data Volume: Collect enough data to ensure that your conclusions are statistically valid and robust enough for training machine learning models. Adequate data will help the model generalize well and minimize the risk of overfitting.
2. Performance Analysis: Once the data is collected, you will analyze the impact of WiFi collisions and retransmissions on network performance. This involves identifying retransmitted packets and calculating the delay caused by these retransmissions. You will explore how retransmission frequency correlates with overall network latency, throughput, and efficiency, and visualize these relationships to uncover patterns and insights. Task details:
• Retransmission Analysis: To analyze the retransmissions in your dataset, use Wireshark or an equivalent packet sniffing tool to identify and filter retransmitted packets (display filter: wlan.fc.retry == 1). Focus on the 802.11 frame retransmission indicators. For each packet, track how many times it has been retransmitted (e.g., once, twice, etc.). Present the retransmission distribution in your dataset, showing how often packets are retransmitted.
You can display this using a bar chart or histogram, where the x-axis represents the number of retransmissions and the y-axis represents the number of occurrences.
• Network Performance Analysis: Do the network performance analysis by choosing one of the most congested networks in your dataset. Next, calculate key network performance metrics such as latency, throughput, and efficiency.
   o Latency: Locate a data packet and the corresponding ACK frame. Calculate the latency as the difference between the ACK packet's timestamp and the data packet's timestamp.
   o Throughput: Calculate the throughput as the total amount of data successfully transmitted over a given period. You can compute this by summing up the data packet sizes filtered for a given device (MAC address) and dividing by the duration of the capture.
   o Efficiency: Evaluate the efficiency by comparing the amount of useful data transmitted to the total number of transmissions, factoring in retransmissions.
To present your findings, create graphs and charts that show the relationship between retransmission frequency and network performance metrics such as latency, throughput, and efficiency. Use visualization tools like Matplotlib or Seaborn to create plots that clearly depict these relationships.
3. Machine Learning Evaluation: You will apply machine learning techniques to evaluate the factors that influence retransmissions. By extracting features such as signal strength, packet size, and channel utilization, you will train a machine learning model to predict the likelihood of packet retransmission. This model will help you understand which factors are most critical in determining whether a packet is successfully transmitted or needs to be retransmitted.
• Feature Extraction: Extract relevant features from your dataset, such as signal strength, packet size, and channel utilization, to serve as inputs for your machine learning model.
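The latency, throughput, and efficiency calculations described above can be sketched as follows, assuming the capture has been exported from Wireshark into (timestamp_s, size_bytes, is_ack, retry) records for one device. This record layout is a hypothetical simplification, not a Wireshark export format:

```python
# Sketch of the three performance metrics over a list of packet records.
def latency_s(data_ts, ack_ts):
    # latency = ACK timestamp minus the matching data packet's timestamp
    return ack_ts - data_ts

def throughput_bps(packets):
    data = [p for p in packets if not p[2]]           # data frames only
    duration = packets[-1][0] - packets[0][0]         # capture duration in seconds
    return sum(p[1] for p in data) * 8 / duration     # bits per second

def efficiency(packets):
    data = [p for p in packets if not p[2]]
    fresh = [p for p in data if not p[3]]             # frames with retry bit clear
    return len(fresh) / len(data)                     # useful / total transmissions

pkts = [(0.000, 1500, False, False),
        (0.002, 14, True, False),
        (0.010, 1500, False, True),                   # a retransmission
        (0.013, 14, True, False)]
print(efficiency(pkts))                               # 0.5
```

Grouping these metrics per capture (or per time window) gives the points you then plot against retransmission frequency.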
• Model Training and Evaluation: Train some basic machine learning models, e.g., kNN, Random Forest, and SVM, to predict packet retransmissions. Evaluate the models' performance using a portion of your dataset and analyze the results.
• Model Interpretation: Discuss which factors/features are most influential in predicting retransmissions and the significance of these factors for network performance.
Resources for machine learning:
• Books & Online Documents
   ▪ Scikit-learn Documentation: The official documentation is comprehensive, providing guides, examples, and references that are invaluable for both beginners and advanced users.
   ▪ Python Machine Learning by Sebastian Raschka
• Online Courses
   ▪ Scikit-learn in Python for Machine Learning
   ▪ Scikit-learn for Beginners
4. Protocol Enhancement Design:
o Based on the insights gained from your machine learning analysis, you will propose enhancements to the WiFi protocol aimed at reducing retransmissions and improving overall network performance. While implementing your proposed enhancement is not required, you should demonstrate your proposal's feasibility through analysis and discussion. Here are some example enhancements you might consider (but don't feel limited to these):
   ▪ Adaptive Retransmission Mechanisms: Design mechanisms that adjust retransmission strategies based on real-time network conditions, such as varying the backoff time or retransmission count based on signal strength or congestion levels.
   ▪ Dynamic Channel Allocation: Propose a method for the AP to dynamically switch the channel in a way that reduces the likelihood of retransmissions.
   ▪ Error Control Enhancements: Suggest improvements to error detection and correction mechanisms to prevent the need for retransmissions under certain conditions.
o Critical Thinking: Explain the rationale behind your proposed enhancements, considering trade-offs such as complexity, scalability, and potential impacts on existing WiFi standards.
Evaluate how these changes might be implemented in real-world networks and what challenges might arise.

Submission Requirements
1. Collected Dataset: Submit the dataset in a zip file, organized according to the template to be provided.
2. PDF Report: Submit a single PDF report (maximum 10 pages) that includes:
• A detailed explanation of the data collection process and detailed metadata of your datasets.
• Analysis of the impact of retransmissions on network performance, including visualizations and insights derived from the analysis.
• A discussion of the machine learning evaluation, covering feature extraction, model training, and model performance comparison.
• The design and critical analysis of proposed protocol enhancements, including the rationale behind your decisions, trade-offs considered, and feasibility for real-world implementation.
3. Jupyter Notebook and Code: Submit all code used for data analysis and performance evaluation, either as standalone scripts or within the Jupyter notebook. Include a Jupyter notebook (Python 3) that demonstrates the machine learning process, from importing the dataset to training and evaluating models using scikit-learn (or other machine learning libraries). Ensure that the notebook is well-documented with explanations of each step.

Submission Process:

Marking Rubric (Task / Excellent (85-100%) / Average (50-84%) / Poor (0-49%)):

Data Collection and Data Analysis (25%):
• Excellent: Professional analysis with detailed procedures; the analysis proves adequate data collection.
• Poor: Visualizations are unclear or missing. The report is difficult to follow and lacks connection to course content.

Machine Learning (25%):
• Excellent: Dataset is preprocessed correctly. At least two machine learning algorithms are compared. Clear examples of model usage and thorough analysis of performance are provided.
• Average: Basic preprocessing is done, but comparison of models is limited or not well explained.
• Poor: Some examples are unclear or missing, with limited analysis; models are used without comparison, or the analysis is severely lacking or incorrect.

Protocol Enhancement Design and Discussion (25%):
• Excellent: Demonstrates strong understanding of wireless protocols and proposes innovative enhancements with clear feasibility and benefit analysis.
• Average: Some understanding of wireless protocols is demonstrated; proposed enhancements appear to have some benefit, but the discussion is not comprehensive.
• Poor: Limited or no understanding of wireless protocols is demonstrated. Proposed enhancements are vague or unfeasible.

Required Knowledge and Skills
• Understanding of WiFi Protocols: Knowledge of the 802.11 WiFi standard, including packet structure and transmission mechanisms, and the ability to think creatively about protocol improvements.
• Wireshark Proficiency: Familiarity with Wireshark or similar tools for WiFi network data capture and analysis.
• Basic Python Programming: Ability to use Python for data analysis and machine learning, including libraries such as Pandas, Scikit-learn, and Matplotlib.

Important Notes
• Adopt a passive approach to data collection and limit your data collection to data that do not have any user privacy issues associated with it.

End of Project Specs
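As an illustrative sketch of the model training and comparison step from the Machine Learning Evaluation task (kNN, Random Forest, SVM), using synthetic stand-in features rather than the real dataset:

```python
# Hedged sketch of the model-comparison step. The features and the toy
# labelling rule are synthetic placeholders for the real signal-strength,
# packet-size and channel-utilization features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))                    # [signal, pkt_size, util]
y = (X[:, 0] + 0.5 * X[:, 2] < 0).astype(int)    # 1 = retransmitted (toy rule)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)
models = {
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    scores[name] = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: {scores[name]:.3f}")
```

On the real data, the same loop applies after feature extraction; Random Forest additionally exposes feature_importances_, which is useful for the Model Interpretation discussion.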
Optimized Three-Dimensional Stencil Computation on Fermi and Kepler GPUs
Siemens Corporate Technology, SC Siemens SRL

Abstract—Stencil based algorithms are used intensively in scientific computations. Graphics Processing Units (GPU) based implementations of stencil computations speed up the execution significantly compared to conventional CPU only systems. In this paper we focus on double precision stencil computations, which are required for meeting the high accuracy requirements inherent for scientific computations. Starting from two baseline implementations (using two dimensional and three dimensional thread block structures respectively), we employ different optimization techniques which lead to seven kernel versions. Both Fermi and Kepler GPUs are used, to evaluate the impact of different optimization techniques for the two architectures, covering GPU cards designed for desktop PCs and notebook PCs. The results indicate that the ratio of execution time is roughly equal to the inverse of the ratio of power consumption.

Keywords— stencil, GPU, double precision, Kepler, Fermi, optimization

The global memory is the memory to/from which the CPU or host thread can write/read, and it is accessible by all multiprocessors. Furthermore, each multiprocessor also contains shared memory and registers, which are split between the thread blocks and the threads which run on the multiprocessor, respectively. With the introduction of the third and fourth generation general purpose GPU (GPGPU), the Fermi and the Kepler generations respectively [3], [4], the double precision performance has increased, and a true cache hierarchy (L1/L2) and more shared memory are available. Furthermore, the global memory bandwidth plays an important role, since the performance of a stencil computation, which updates the value at each grid location as a function of its neighboring locations, is typically bandwidth bound. This pattern is found in several application domains, like image processing, computational fluid dynamics, weather prediction, quantum physics.
I. INTRODUCTION
Graphics Processing Units (GPUs) are dedicated processors, designed originally as graphic accelerators. The GPU is viewed as a compute device which is able to run a very high number of threads in parallel inside a kernel (a function, written in C language, which is executed on the GPU and launched by the CPU). The threads of a kernel are organized at three levels: blocks of threads are organized in a three dimensional (3D) grid at the top level, threads are organized in 3D blocks at the middle level, and, at the lowest level, threads are grouped into warps (groups of 32 threads formed by linearizing the 3D block structure along the x, y and z axes respectively) [2]. The GPU contains several streaming multiprocessors, each of them containing several cores. The GPU (usually also called device) contains a certain amount of global memory.

978-1-4799-6233-4/14/$31.00 ©2014 IEEE

Previous studies have shown that, if regular Cartesian grids are used, GPU based implementations are able to significantly speed up the execution compared to regular CPU based implementations [5], [6]. Research activities on stencil based computations have been reported long before the introduction of general purpose GPUs. These activities focused on the information transfer between nodes [7] and the relationship between partition shape, stencil structure and architecture [8]. Different optimization techniques have been reported more recently for GPU based stencil computations. The most often encountered optimization techniques used in the past are blocking at registers and at shared memory [9], [10]. Pre-Fermi GPUs did not have any cache memories, making the shared memory blocking technique vital for reducing memory access counts. Temporal blocking is another extensively used technique, with mixed performance improvements on GPUs [11], [12], [13]. Non-GPU architectures have also been used for stencil based computations [14].
The goal of the current work is to evaluate the performance of 3D stencil based algorithms on a series of recent GPUs. Previous research activities have focused on single precision computations. With the introduction of the Fermi and the Kepler architecture, the performance of double precision computations on NVIDIA GPU cards has increased substantially. To meet the high accuracy requirements inherent for scientific computations [15], [16], in the current work we focus on double precision computations. Starting from two baseline implementations, we employ different optimization techniques which lead to seven different kernel versions. Both Fermi and Kepler GPUs are used, to evaluate the impact of different optimization techniques for the two architectures.

The paper is organized as follows. Section II first performs a brief introduction of the 3D stencil used herein. Next, the baseline implementations are introduced, followed by the different optimized approaches. Section III displays the results obtained with the different kernel versions for different Fermi and Kepler GPUs. Finally, section IV draws the conclusions.

Fig. 1. 7-point stencil used for the numerical solution of the unsteady heat diffusion equation.

II. METHODS
For studying 3D stencil based algorithms implemented on graphics processing units, we consider the 3D unsteady heat conduction problem, which is modeled as a second order partial differential equation describing the distribution of heat over time over a given 3D space:

∂T/∂t = α(∂²T/∂x² + ∂²T/∂y² + ∂²T/∂z²), (1)

where α is the thermal diffusivity constant and T represents the temperature at any point in space (x, y, z) and time t. For the numerical solution of (1) we apply a finite difference method on a 3D grid of points. A uniform mesh of points is used and the forward difference in time and central difference in space (FTCS) method is applied, leading to a 3D 7-point stencil:

(T^{n+1}_{i,j,k} − T^n_{i,j,k})/Δt = α[(T^n_{i+1,j,k} − 2T^n_{i,j,k} + T^n_{i−1,j,k})/Δx² + (T^n_{i,j+1,k} − 2T^n_{i,j,k} + T^n_{i,j−1,k})/Δy² + (T^n_{i,j,k+1} − 2T^n_{i,j,k} + T^n_{i,j,k−1})/Δz²], (2)

which can be rewritten as:

T^{n+1}_{i,j,k} = T^n_{i,j,k} + d(T^n_{i+1,j,k} + T^n_{i−1,j,k} + T^n_{i,j+1,k} + T^n_{i,j−1,k} + T^n_{i,j,k+1} + T^n_{i,j,k−1} − 6T^n_{i,j,k}), (3)

where d = αΔt/Δx². In the above equations n represents the discrete time step number, (i, j, k) represents the spatial index, Δt is the time step and Δx is the mesh spacing, which is equal in all directions. T^n_{i,j,k} represents the temperature value at point (i, j, k), at time step n. The numerical solution is stable if the CFL condition holds: d = αΔt/Δx² < 1/6.

A. Baseline GPU-based implementations
In the following we introduce two baseline GPU-based implementations of the unsteady heat diffusion problem. For the first baseline implementation (called in the following 3DBase) each grid point is handled by a separate thread. Two buffers are allocated, one for the values at the previous time step and one for the values at the new time step. To eliminate the memory copy requirement from one buffer to the other, the buffers are swapped at the end of each time step.

Since for the latest GPUs the execution configuration allows not only for 3D blocks of threads, but also for a 3D grid of thread blocks, the threads and the thread-blocks are organized both into 3D structures. Thus, each thread of the grid corresponds to a grid point in the 3-D computational domain. To compute the new value at a grid point, each thread performs seven global memory read operations at each time step. Since global memory operations are very slow, this represents a severe limitation of the kernel performance.

In the CUDA architecture, each thread block is divided into groups of 32 threads called warps, each of which is executed in a SIMD fashion (all threads of the same warp execute the same instruction at a time). If the threads inside a warp follow different execution paths, the execution of the branches is serialized. Handling of warp divergence is thus required to distinguish between boundary and non-boundary nodes, so as to perform the computations only for the latter ones.
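As an illustrative aside (not from the paper), the explicit FTCS update of equation (3) can be sketched on the CPU with NumPy; swapping the two arrays at the end of each step mirrors the buffer swap used by the 3DBase kernel:

```python
# Illustrative CPU sketch (not the paper's CUDA kernel): FTCS time stepping
# of the 7-point stencil, with the two buffers swapped at the end of each
# step instead of copied.
import numpy as np

def step(T_old, T_new, d):
    # Update interior points only; boundary values stay fixed (Dirichlet).
    c = T_old[1:-1, 1:-1, 1:-1]
    T_new[1:-1, 1:-1, 1:-1] = c + d * (
        T_old[2:, 1:-1, 1:-1] + T_old[:-2, 1:-1, 1:-1] +
        T_old[1:-1, 2:, 1:-1] + T_old[1:-1, :-2, 1:-1] +
        T_old[1:-1, 1:-1, 2:] + T_old[1:-1, 1:-1, :-2] - 6.0 * c)

n, d = 16, 0.125                 # d < 1/6 satisfies the CFL condition
T_old = np.zeros((n, n, n))
T_old[0, :, :] = 100.0           # one hot face; the other faces stay at 0
T_new = T_old.copy()
for _ in range(100):
    step(T_old, T_new, d)
    T_old, T_new = T_new, T_old  # buffer swap, as in the 3DBase kernel
```

Because the scheme is stable here (d < 1/6), all temperatures stay within the boundary range [0, 100].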
As can be observed in (3), the value at a grid point at time step n+1 is computed from the values at the previous time step, from the same grid point and from its six neighboring points, leading to a 7-point stencil computation (fig. 1). This solution scheme is fully explicit: the computation of the new value at any grid point is fully independent from the computations at the other grid points.

On the other hand, stencil codes can be characterized by their FLOPs per byte ratio. The baseline implementation performs 13 double-precision floating point operations per update [17]. This leads to 13 · xDim · yDim · zDim operations performed at each iteration (xDim, yDim and zDim represent the grid dimensions). If we assume that at each time step, once the old values are loaded, they remain in the cache memory (which is unlikely for grid dimensions which exceed the cache size), the amount of data loaded and stored per time step is equal to xDim · yDim · zDim · sizeof(double) · 2. Hence the flop per DRAM byte ratio is:

(13 · xDim · yDim · zDim) / (xDim · yDim · zDim · sizeof(double) · 2) = 0.8125. (4)

Current GPUs, however, have a significantly higher ratio. According to this model the performance of the stencil on the GPU is therefore limited by its memory bandwidth.

For the second baseline implementation (2DBase, fig. 2), unlike the 3DBase implementation, for which a thread updates a single point, the same thread operates on several grid points. These points are placed equidistant from each other; the distance from one grid point to another is determined based on the size of the 3D domain (xDim · yDim).

B. Optimized implementations
Besides global memory, the GPU architecture provides fast on-chip memory, registers and shared memory, which are distributed between threads and thread blocks respectively. Next, we describe a series of optimization techniques for minimizing warp divergence and global memory accesses.

1) Three-dimensional baseline implementation with Shared Memory Usage and Data Overlap
The starting point for the new kernel is the 3DBase implementation. Since shared memory is allocated at thread block level, threads can cooperate when populating data blocks allocated in the shared memory. If data can then be reused by different threads, global memory accesses are reduced and overall kernel performance is improved. Shared memory arrays of size blockXDim · blockYDim · blockZDim are allocated (blockXDim, blockYDim and blockZDim represent the dimensions of the thread blocks). Each thread within a block loads the value of the grid point it handles from global memory to shared memory. To avoid undefined behavior and incorrect results when sharing data read by different threads, a synchronization barrier is introduced. All values required for the implementation of (3) are then read from the shared memory. With this technique, threads lying at the border of a thread block do not have access to all their neighbors and can not compute the corresponding new values. Hence, the execution configuration is designed so as to ensure block overlapping (3DShMOverL).

2) Three-dimensional baseline implementation with Shared Memory Usage and No Data Overlap
Starting again from the 3DBase implementation, a different shared memory based strategy is developed. The shared memory arrays are padded with an additional slice on each side of the 3D block, leading to a total size of (blockXDim + 2) · (blockYDim + 2) · (blockZDim + 2), as shown in fig. 4. First, each thread populates the value of the grid point it handles in shared memory. Next, the threads located on the boundary of the block load the remaining data slices from global memory to the shared memory (note that the corner points of the blocks are not required for the 7-point stencil). To load points located outside of the block, conditional operations are introduced, which cause branch divergence. Thus, each thread of a thread block has access to all its neighbors and is able to update the corresponding grid point; no overlapping between thread blocks is required (3DShMNoOverL).

3) Two-dimensional distribution of threads with additional register usage
The 2DBase implementation can be optimized by storing data in registers. Therein, the value of the current grid point for adjacent 2D slices is read from the global memory by the same thread. The same holds true for grid points which lie on the front or back sides of the 2D slices. Because slices are iterated along the z direction, the value at grid point (i, j, k+1) becomes the value at (i, j, k) at the next iteration. Similarly, the value at (i, j, k) becomes the value at (i, j, k−1). Instead of rereading these values, registers are used for caching them, and two global memory accesses are saved at each iteration along the Z axis (in the following this kernel is called 2DReg).

Fig. 2. 2DBase kernel: the computational grid is divided into x-y planes and a loop is then used to traverse the grid in the z-direction.
Fig. 3. 3DShMOverL kernel: the shared memory arrays have the same size as the thread blocks. Thread blocks overlap to enable the computation at all grid points.
Fig. 4. 3DShMNoOverL: the shared memory arrays are padded with additional data slices loaded by the threads located at the border of the thread block.
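To make the register-caching idea of the 2DReg kernel concrete, here is a hedged Python sketch (the paper's kernels are CUDA): a "thread" sweeping one (i, j) column along z keeps the values at k−1, k and k+1 in local variables and rotates them, so only one new z-read is needed per iteration instead of three:

```python
# Illustrative sketch of register rotation along z, as used by 2DReg.
import numpy as np

def sweep_column(T_old, T_new, i, j, d):
    nz = T_old.shape[2]
    below = T_old[i, j, 0]            # cached value at k-1
    center = T_old[i, j, 1]           # cached value at k
    for k in range(1, nz - 1):
        above = T_old[i, j, k + 1]    # the only new read along z
        T_new[i, j, k] = center + d * (
            T_old[i + 1, j, k] + T_old[i - 1, j, k] +
            T_old[i, j + 1, k] + T_old[i, j - 1, k] +
            above + below - 6.0 * center)
        below, center = center, above  # "register" rotation along z

rng = np.random.default_rng(1)
T_old = rng.random((6, 6, 6))
T_new = np.zeros_like(T_old)
sweep_column(T_old, T_new, 2, 3, 0.1)
```

In the CUDA kernel the three cached values live in per-thread registers, which is what removes the two redundant global memory reads per iteration.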
4) Two-dimensional distribution of threads with Shared Memory Usage
As for the kernels with 3D thread blocks, shared memory can also be used to reduce global memory accesses for the kernels with 2D thread blocks. The size of the shared memory array chosen for this kernel version is (blockXDim + 2) · (blockYDim + 2): to allow each thread of the thread block to compute the new value of the corresponding grid point, additional slices are populated at each border of the 2D shared memory array. Each thread first reads the value of the grid point it handles and stores it in the shared memory. Next, threads located on the boundary of the block load the remaining values (in the following this kernel is called 2DShM).

5) Two-dimensional distribution of threads with Additional Register and Shared Memory Usage
For the implementation version described in section II.B.4, the loading of the central section of the shared memory does not introduce any divergent branches, since it is not conditioned. The loading of the slices with y index equal to 0 or blockYDim + 2 introduces a maximum of two divergent branches, one for each half-warp, depending on the compute capability of the GPU. On the other side, the slices with x index equal to 0 or blockXDim + 2 lead to divergent branches in which only one thread of the entire half-warp performs a read. To reduce branch divergence, the shared memory array is therefore used only for the central section and for the slices with y index equal to 0 or blockYDim + 2, while the other values are read from the global memory and stored into registers: each thread first stores the value of the grid point it handles in the shared memory, and threads located on the upper and lower boundary of the block then load the remaining values. Only threads lying at the left or right border perform separate global memory reads (fig. 5 – 2DShMReg), while the other values are safely read from the shared memory. Besides the two registers that store the values of the nodes located next to the left and right boundaries, another two registers are used for the optimization described in section II.B.2.

III. RESULTS
To evaluate the performance of the different strategies for running 3D stencil based algorithms on GPUs, we used three different NVIDIA GPU cards: GeForce GTX 480, GeForce GTX 660M and GeForce GTX 680 (the first one is based on the Fermi architecture, while the other two are based on the Kepler architecture), and the CUDA toolkit version 5.5. The unsteady heat conduction problem was solved on a rectangular domain with Dirichlet boundary conditions, whereas the boundary values were set to 100 °C for one side of the rectangle and 0 °C for the other sides. The thermal diffusivity constant was set to 1.9 · 10⁻⁵ m²/s and the computations are performed until convergence is reached. The numerical solution was obtained on a grid of 128x128x128 nodes and is displayed in fig. 6. The numerical solution was identical for all three GPU cards and for all implementation versions down to the 15th decimal, i.e. close to the precision of the double-type representation in computer data structures.

Fig. 5. 2DShMReg: Northern and southern slices are read from the shared memory, eastern and western values from the global memory.
Fig. 6. Computation result for the unsteady heat conduction problem on a rectangular domain with Dirichlet boundary conditions.

Table 1 displays the execution times for one time step for the three above mentioned GPU cards, obtained for the seven different kernel versions introduced in the previous section. The GTX660M card leads to the largest execution times, although it has been released considerably later than the GTX480 card. This can be explained, however, by the fact that the GTX660M is designed for notebook PCs and has a much lower power budget. The ratio of the execution times for the GTX660M and GTX680 cards varies between 4.26 and 5.56 for different kernel versions. This roughly reflects the inverse of the power consumption ratio, which is equal to 3.9 (with a power consumption of 250W and 195W for the GTX480 and GTX680 respectively, the GTX660M only required 50W).
TABLE I. EXECUTION TIME [MS] FOR A SINGLE TIME STEP, OBTAINED FOR THE SEVEN DIFFERENT IMPLEMENTATION VERSIONS ON THREE DIFFERENT GPU CARDS.

Method        | GTX480 | GTX660M | GTX680
3DBase        | 1.7    | 3.45    | 0.62
3DShMOverL    | 3.5    | 6.17    | 1.13
3DShMNoOverL  | 1.8    | 3.78    | 0.73
2DBase        | 1.2    | 3.09    | 0.63
2DReg         | 0.9    | 2.47    | 0.58
2DShM         | 1.2    | 2.87    | 0.59
2DShMReg      | 1.09   | 2.32    | 0.48

The GTX680 is the best performing card: for each of the seven implementation versions it leads to the smallest execution times. In the following we focus on the differences between the kernel versions for the GTX680 card, which was determined as the best performing one considered herein. Table 2 displays, besides the execution time, other important details of the various kernel versions.

The two baseline implementations (2DBase and 3DBase) lead to almost identical execution times. Referring first to the kernels based on a 3D thread block structure, the 3DShMOverL kernel performs worse than the 3DBase kernel: execution time increased by 82% although the number of global accesses was reduced by 66.13%. This can be explained by the fact that a considerable amount of threads perform only load operations. Compared to the 3DBase kernel, this implementation is compute limited instead of bandwidth limited. The main reason for the change of the limitation type lies in the number of divergent branches, which increased considerably and which in the end leads to a higher execution time than for the 3DBase kernel. Compared to the 3DShMOverL kernel, the execution time decreased by 35.39% and the total number of read operations was reduced by 25.66% for the 3DShMNoOverL kernel.

Next, we refer to the kernels based on a 2D thread block structure. The 2DReg kernel leads to a significant reduction of memory operations (28.34%) and, as a result, of the execution time (7.93%), compared to the 2DBase kernel. The 2DShM kernel further reduces the number of global memory load operations but execution time increases slightly, which is caused by the non-optimized register usage. Interestingly, whereas for the GTX660M and the GTX680 cards the 2DShMReg kernel performs best, for the GTX480 card the 2DReg kernel leads to the smallest execution time. For the GTX480, since the L1 cache is intensively used for caching global memory read operations, the 2DReg kernel outperforms the 2DShMReg kernel. On the other hand, for the GTX660M and the GTX680, since the L1 cache functionality is limited to register spilling, shared memory usage became more important, illustrated by the better performance of the 2DShMReg kernel.

IV. CONCLUSIONS
In this paper, we have presented performance studies for 3D stencil based algorithms on recent NVIDIA GPUs. To our knowledge this is the first study to evaluate different implementation and optimization strategies for double precision computations. The increased accuracy obtained for double precision is required in scientific computations, which represent the main area of application for the 3D stencil based algorithms.

For the analysis we have used Fermi and Kepler architecture based cards, which represent the last two released GPU architectures from NVIDIA. Besides the shift in L1 cache usage from Fermi to Kepler, other important minor and major changes in the hardware configuration have been performed [4]. Hence, starting from two different baseline implementations (based on 3D and 2D thread block structures), we have applied different optimization strategies which have led to different performance changes for the Fermi and Kepler cards.
Overall, the GTX680 GPU card (Kepler architecture) performed best for a kernel with 2D thread block structure using both registers and shared memory (2DShMReg), whereas for the GTX480 (Fermi architecture) the 2D kernel which does not use shared memory (2DReg) performed best. Finally, for the Kepler architecture we have evaluated the performance for a GPU designed for desktop PCs (GTX680) and for a GPU designed for notebook PCs (GTX660M). The results have indicated that the ratio of execution time is roughly equal to the inverse of the ratio of power consumption.

TABLE II. KERNEL PERFORMANCE AND DETAILS FOR THE GTX680 CARD.

Method        | Execution time [ms] | Reg. per thread | Divergent branches | Shared memory per block [bytes] | Total number of 64 bit global load instr. | Total number of 64 bit global store instr.
3DBase        | 0.62 | 25 | 12016 | –    | 14002632 | 2000376
3DShMOverL    | 1.13 | 19 | 20811 | 4096 | 4741632  | 2000376
3DShMNoOverL  | 0.73 | 21 | 12694 | 8000 | 3524851  | 2000376
2DBase        | 0.63 | 25 | 94    | –    | 14002632 | 2000376
2DReg         | 0.58 | 25 | 94    | –    | 10033632 | 2000376
2DShM         | 0.59 | 25 | 94    | 800  | 6953688  | 2000376
2DShMReg      | 0.48 | 25 | 94    | 640  | 2984688  | 2000376

ACKNOWLEDGMENT
This work is supported by the program Partnerships in Priority Domains (PN II), financed by ANCS, CNDI. This paper is supported by the Sectoral Operational Programme Human Resources Development (SOP HRD), Romanian Government. The research leading to these results has received funding from the European Union's Seventh Framework Programme. We hereby acknowledge the structural funds project PRO-DD (POS-CCE, O.2.2.1., ID 123, SMIS 2637, ctr. No 11/2009) for providing the infrastructure used in this work.

REFERENCES
[1] D. Kirk, and W.M. Hwu, Programming Massively Parallel Processors: A Hands-on Approach.
[2] NVIDIA Corporation, "CUDA, Compute unified device architecture".
[4] NVIDIA Corporation, "NVIDIA Kepler GK110 Architecture".
[6] T. Shimokawabe, T. Aoki, T. Takaki, T. Endo, A. Yamanaka, N. Maruyama, A. Nukada, and S. Matsuoka, "Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer", Intern. Conf. for High Performance Computing, Networking, Storage and Analysis, pp. 13-18, 2011.
[7] G.C. Fox, "Concurrent processing for scientific calculations", IEEE Computer Society International Conference, pp. 70-73, 1984.
[8] D. Reed, L. Adams, and M. Patrick, "Stencils and problem partitionings: Their influence on the performance of multiple processor systems", IEEE Trans. Comput., vol. 7, pp. 845-858, 1987.
[9] P. Micikevicius, "3D Finite difference computation on GPUs using CUDA", Workshop on General Purpose Processing on Graphics Processing Units, pp. 79-84, 2009.
[10] L. M. Itu, C. Suciu, F. Moldoveanu, and A. Postelnicu, "GPU Optimized Computation of Stencil Based Algorithms", RoEduNet Inter. Conf.
[11] T. Grosser, A. Cohen, P. H. J. Kelly, J. Ramanujam, P. Sadayappan, and S. Verdoolaege, "Split tiling for GPUs: automatic parallelization using trapezoidal tiles", Workshop on General Purpose Processing on Graphics Processing Units.
[12] J. Holewinski, L. N. Pouchet, and P. Sadayappan, "High-performance code generation for stencil computations on GPU architectures", ACM Intern. Conf. on Supercomputing, pp. 311-320, 2012.
[14] K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, and K. Yelick, "Stencil computation optimization and autotuning on state-of-the-art multicore architectures", ACM/IEEE Conf. on Supercomputing, pp. 1-12, Nov. 2008.
[15] C. Niţă, L. M. Itu, and C. Suciu, "GPU Accelerated Blood Flow Computations", pp. 1-7, Jan. 2014.
[16] P. Zaspel, M. Griebel, "Solving incompressible two-phase flows on multi-GPU clusters", Computers & Fluids, vol. 80, pp. 356-364, 2012.
[17] N. Maruyama, and T. Aoki, "Optimizing stencil computations for NVIDIA Kepler GPUs".
There are two problems in this assignment:

Problem 1 (stack-based exploitation). You are given a binary (implementing a parser for the Intel HEX format for binary files), containing one or more vulnerabilities. You are asked to exploit these vulnerabilities, using stack-based exploitation methods, to achieve a number of objectives, culminating in arbitrary code execution. There are 4 (four) subproblems within this problem, with increasing difficulty. The subproblems are designed in a way that helps you build exploit primitives that would eventually lead to arbitrary code execution.

Problem 2 (heap-based exploitation). There are two binaries in this problem, with similar functionalities but different exploit mitigations in place. You are asked to find heap-related vulnerabilities (use-after-free and double-free) and exploit them to achieve a number of objectives. This problem is divided into 7 (seven) subproblems with increasing difficulty. Just as in Problem 1, these subproblems are designed to help you build the necessary primitives for the final goal of achieving arbitrary code execution.

To see the details for each problem, read the instructions in the respective folders (./stack for Problem 1 and ./heap for Problem 2).

There are two components to this assignment:
Artefact: This consists of python scripts implementing the exploitation steps.
Report: This is a PDF document explaining how you solve the assignment problems. The report file must be named following the convention <uid>_report.pdf, e.g., u1234567_report.pdf. Limit your report length to around 3500 words.

Execution environment
This assignment assumes all the associated binaries are run inside the lab VM. You need to install two additional software packages to run and test these binaries: socat and libreadline-dev.
Install them in the lab VM using the following command:

$ sudo apt install socat libreadline-dev

Testing your solution scripts
Each subproblem requires you to write a python script to launch the exploit. For the purpose of assessment, you must assume that the binaries will be run on a remote server, so in particular you have no direct visibility into the program states (e.g., you cannot run gdb and inspect the buffer addresses or libc addresses while executing your attacks). To assist you in testing your solution, and to make sure that it complies with this requirement, each problem comes with a script to run the binary as a local server application that interacts only through the (localhost) network socket. Your submitted solution scripts must interact with the binaries only through the network sockets. This can be done through pwntools' remote() command. You are, however, free to use the process() command to interact with the binaries while you are developing your solutions, but in the final submission, the process() command must be replaced by the remote() command.

COMP3703 assignment 2
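As a hedged illustration of the required workflow (the port 12345, the 72-byte offset and the address 0x401234 below are placeholders, not values taken from the assignment binaries), a solution-script skeleton might look like this; the payload helper uses only the standard library so it can be tested offline, while the network interaction goes through pwntools' remote(), as required for the final submission:

```python
# Skeleton of a solution script -- all concrete values are hypothetical.
import struct

def build_payload(offset: int, target_addr: int) -> bytes:
    """Padding up to the saved return address, then a little-endian
    64-bit address that overwrites it."""
    return b"A" * offset + struct.pack("<Q", target_addr)

payload = build_payload(72, 0x401234)   # placeholder offset and address

if __name__ == "__main__":
    # Inside the lab VM, uncomment to talk to the server started by the
    # provided run script (use process() only while developing):
    # from pwn import remote
    # io = remote("localhost", 12345)
    # io.sendline(payload)
    # print(io.recvall(timeout=2))
    pass
```

The same structure scales to the later subproblems: keep payload construction in pure, testable helpers and confine all I/O to the remote() connection.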