CS 6262 Project 3 – Malware Analysis

CS 6262

Sections:
1. Windows Malware Analysis
2. Linux Malware Analysis
3. Android Malware Analysis
4. Tips for assignment-questionnaire.txt
5. Miscellaneous VM Performance Tips
6. Submission

Windows Malware Analysis

Scenario: You got a malware sample from the wild! Your task is to discover what the malware does by analyzing it. How do you discover the malware's behaviors? There are multiple ways of analyzing it, but we'll focus on two: static analysis and dynamic analysis.

Static Analysis:
● Manual reverse engineering
● Programmatic binary analysis

Dynamic Analysis:
● Network behavioral tracing
● Runtime system behavioral tracing (file/process/thread/registry)
● Symbolic execution
● Fuzzing

In our scenario, you are going to analyze the given malware with tools that we provide. These tools help you analyze the malware with both static and dynamic analysis.

Objective:
1. Find which server controls the malware (the command and control (C2) server)
2. Discover how the malware communicates with the C2 server
   a. URL and payload
3. Discover what activities are performed by the malware (attack activities)

Requirements:
1. Make sure that no malware traffic goes out from the virtual machine
2. The command and control server is dead, so YOU need to reconstruct it
   a. Use the provided tools to reconstruct the server and then reveal the hidden behaviors of the malware
3. Analyze network traffic on the host and figure out the list of available commands for the malware
4. Analyze network traffic and the program trace of the host, and figure out what the malware does
5.
Write down your answers in assignment-questionnaire.txt.

Project Structure:
● Make sure to install/update to the latest version of VirtualBox
  ○ https://www.virtualbox.org/wiki/Downloads
● Download the Virtual Machine (VM)
  ○ https://www.dropbox.com/s/dnk6acztw9ewp83/Project%203.zip?dl=0
  ○ Unarchive the file with 7zip; the password is cs6262
● Network configuration:
  ○ tap0
    ■ Virtual network interface for Windows XP
      ● IP address: 192.168.133.101
  ○ br0
    ■ A network bridge between Windows XP and Ubuntu
      ● IP address: 192.168.133.1
  ○ enp0s3
    ■ The interface that faces the Internet
      ● IP address: 10.0.2.15 (varies with your VirtualBox settings)
● Open VirtualBox
  ○ Go to File → Import Appliance
  ○ Select the ova file and import it
  ○ For detailed information on how to import the VM, see:
  ○ Before starting, it may be useful to adjust the VM settings (allocate more base memory, processors, etc.) to match your device for better performance.
● VM user credentials
  ○ Username: analysis
  ○ Password: analysis

NOTE: VM Setup
● For M-series Mac users:
  ○ Please install the latest version of UTM (https://mac.getutm.app/) and follow its instructions to import and set up the VM.
● In the Virtual Machine:
  ○ Files
    ■ init.py
      ● Initializes the project environment after you run $ ./init.py
    ■ update.sh
      ● Please run this script when you start the project! (If it says that you're already updated when you run it, that's fine.)
      ● If you already completed stage 1 before running update.sh, you do NOT need to redo stage 1, but you will need to run update.sh to complete stage 2.
    ■ archive.sh
      ● Archives the answer sheet for submission (creates a zip file)
  ○ Directories:
    ■ vm
      ● Stores the Windows XP virtual machine (runs with QEMU)
      ● We use this VM both for Cuckoo and as a testbed.
    ■ shared
      ● A shared directory between the Ubuntu host and the Windows guest (XP runs on a VM within your project VM).
You can copy/move files to or from this directory.
    ■ report
      ● The answer sheet for the project questionnaire
    ■ setup
      ● Files required for setting up the machine. You don't need to modify or use the files in this directory.
  ○ Tools
    ■ network
      ● Configure your network firewall rules (iptables) by editing iptables-rules.
      ● You can allow/disallow/redirect the traffic from the malware.
      ● The ./reset command in this directory will apply the changes.
    ■ cfg-generation (CFG stands for Control-Flow Graph)
      ● An analysis tool that helps you find functions of interest for malicious activity
      ● You need to edit score.h to generate the control-flow graph
      ● Use xdot to open the generated CFG.
    ■ sym-exec
      ● A symbolic executor (based on angr: https://github.com/angr)
        ○ Helps you figure out the commands that the malware expects
      ● Use the cfg-generation tool to figure out the address of the function of interest
    ■ c2-command
      ● A simplified tool for C2 server reconstruction
      ● Write each command as one line in the *.txt file
      ● It will randomly choose one command at a time to send to the malware
  ○ Malware:
    ■ stage1.exe – stage 1 malware
      ● Downloads the stage 2 malware if it receives the correct command
    ■ stage2.exe – stage 2 malware
      ● Downloads the stage 3 malware if it receives the correct command
    ■ payload.exe – the Linux malware attack payload
      ● Analyze the dynamic instruction trace
      ● Write a script to detect where the C&C communication happens: find the loop entry point and the function sequence in the loop
      ● Add a constraint to the symbolic execution to limit the loop to one iteration
      ● Find the feasible attacks within a given set of possible attacks.
Tutorials:
● stage1.exe malware
  ○ Update project 3 before you begin
    ■ Open the terminal (Ctrl-Alt-T, or choose Terminal from the menu)
    ■ Run ./update.sh
      ● This updates any files required for this project.
  ○ Initializing the project
    ■ Open the terminal (Ctrl-Alt-T, or choose Terminal from the menu)
    ■ Run ./init.py
      ● This downloads the stage 1 malware (stage1.exe) into the ~/shared directory
  ○ Note:
    ■ It is likely that security measures will kick in and encrypt these files
      ● Those are all the malware samples you will download during this project
    ■ IMPORTANT
      ● After each download, make sure to check the type of the file
      ● In the Linux VM, execute $ file
      ● If the result is an archive of some sort, then execute: unzip
        ○ Password: infected
      ● For stage1 and stage2, the file format should be
      ● For stage3, the file format should be
● Secure experiment environment
  ○ We need a secure experiment environment to execute the malware
  ○ Why? Otherwise you risk:
    ■ Encrypting your files during a ransomware analysis
    ■ Infecting machines in your corporate network during a worm analysis
    ■ Creating tons of infected bot clients in your network during a bot/trojan analysis
  ○ The solution:
    ■ Contain the malware in a virtual environment
      ● Virtual machine
      ● Virtual network
        ○ Conservative rules (allow network traffic only if it is secure)
    ■ We provide a Windows XP VM as a testbed!
● Run the Windows XP VM
  ○ Run the Windows XP virtual machine with virt-manager
  ○ Open a terminal
  ○ Type "virt-manager" and double-click "winxpsp3"
  ○ Click the icon with the two monitors and click on "basecamp"
  ○ Right-click on basecamp and click "Start snapshot." Click Yes if prompted.
  ○ Once virt-manager successfully loads the snapshot, click "Show the graphical console".
    ■ Click on the Windows Start Menu and choose Turn Off Computer.
    ■ Then select Restart.
  ○ DO NOT MODIFY OR DELETE THE GIVEN SNAPSHOTS!
    ■ The given snapshots are your backups for your analysis.
    ■ If something bad happens on your testbed, always revert to the basecamp snapshot.
● Copy from the shared directory
  ○ Go to the shared directory by clicking its icon (in Windows XP)
    ■ Copy stage1.exe to the Desktop
    ■ If you execute it in the shared directory, an error message will pop up. Please copy the file to the Desktop first.
● Run the malware
  ○ Now we will run the malware
    ■ Execute stage1.exe (double-click the icon)
    ■ It will say "Executing Stage 1 Malware". Click OK.
      ● You should click OK on each dialog to dismiss it
        ○ Otherwise, malware execution will be blocked
  ○ If you want to halt the malware that is running:
    ■ Execute stop_malware in the temp directory.
      ● This will stop the currently running malware.
      ● Please halt the current malware before you execute another malware file.
● Network behavioral analysis
  ○ To analyze network behaviors, you need:
    ■ Wireshark (https://www.wireshark.org/)
      ● Network protocol analyzer
    ■ Cuckoo (https://cuckoosandbox.org/)
      ● Captures and records inbound/outbound network packets
  ○ Observing network behavior
    ■ By capturing and recording network packets through these tools, you can:
      ● Reveal the C&C protocol
      ● Identify attack sources and destinations
  ○ But the malware will not do anything. Why?
    ■ The C2 server is dead!
    ■ Therefore, the malware (the C2 client) will never unfold its behaviors.
    ■ Question:
      ● If we know the C&C dialog of the malware, can we build a fake C2 server in order to unfold the malware's behaviors?
      ● Answer: Heck yeah! That is your job for this project!
● Wireshark
  ○ Let's check it through network monitoring
    ■ Everything has already been installed.
    ■ Open Wireshark and capture the traffic on the network bridge (make sure to run it with root privileges)
    ■ IP address = 192.168.133.1
    ■ Reference: https://www.wireshark.org/docs/
    ■ Get yourself familiarized with Linux commands and with how to use Wireshark.
    ■ Other references:
      ● https://www.wireshark.org/docs/wsug_html_chunked/ChapterIntroduction.html
      ● https://www.varonis.com/blog/how-to-use-wireshark
● Redirect the network connection
  ○ From Wireshark, we can see that the malware tries to connect to the host at 128.61.240.66, but it fails
  ○ Let's redirect it to our fake C2 server
    ■ Go to ~/tools/network
    ■ Edit iptables_rules to redirect the traffic destined for 128.61.240.66 to 192.168.133.1 (the fake host)
  ○ Whenever you edit iptables_rules, always run reset
    ■ (type "./reset" from the ~/tools/network directory)
  ○ IMPORTANT! If you shut down your project VM, be sure to run reset again the next time you start it up.
● Reading C2 traffic
  ○ Observing C2 traffic
    ■ In Wireshark, we can see that the malware can now communicate with our fake C2 server
      ● But there will be no further execution, because the command is wrong
    ■ You can see the contents of the traffic by right-clicking on a line and then clicking Follow → TCP Stream
● Cuckoo
  ○ Let's take a look at Cuckoo. Cuckoo is NOT required to complete this project, but it is a useful tool to help you understand what your malware is doing, and therefore how you might want to modify your score.h file later in the project.
  ○ Note! You can't run the testbed VM and Cuckoo simultaneously.
  ○ Always turn off the testbed VM, then follow the steps below to execute Cuckoo
  ○ Open two terminals.
  ○ $ workon cuckoo (set the virtualenv to cuckoo in both terminal 1 and terminal 2)
  ○ In one terminal, start Cuckoo in debug mode: $ cuckoo -d
  ○ In the other terminal, start the web server: $ cuckoo web
  ○ Reference: Malware Analysis using Cuckoo Sandbox
  ○ If you get an error when running cuckoo web because port 8000 is already in use, run "sudo fuser -k 8000/tcp" and try again.
  ○ Cuckoo uses a snapshot of the given testbed VM.
  ○ The snapshot is 1501466914
  ○ DO NOT TOUCH the snapshot!
● Upload a file to Cuckoo
  ○ To open the Cuckoo web interface, type the following URL into Chromium: http://localhost:8000
  ○ To upload a file, click the red box and choose a file.
● Analysis with Cuckoo
  ○ Once you click the Analyze button, it will take some time to run the malware.
● Figuring out the list of commands
  ○ The malware does not exhibit its behavior because we did not send the correct command through our fake C2 server
  ○ We will use:
    ■ File/registry/process tracing analysis to guess the malware's behavior
    ■ Control-flow graph (CFG) analysis and symbolic execution to figure out the list of correct commands
  ○ The purpose of tracing analysis is to draw a big picture of the malware
    ■ What kinds of system calls/APIs does the malware use?
    ■ Does the malware create/read/write a file? How about the registry?
  ○ The purpose of CFG analysis is to find the exact logic that interprets the command and executes the malicious behavior
  ○ Then, symbolic execution finds the command that drives the malware down that execution path
● Tracing analysis on Cuckoo
  ○ On the sidebar, there are useful menus for tracing analysis. We focus on:
    ■ Static Analysis
      ● API/system calls
    ■ Behavioral Analysis
      ● Traces behaviors in time sequence
● Static analysis on Cuckoo
  ○ Static Analysis
    ■ Information about the malware
    ■ Win32 PE format information
      ● Windows binaries use the PE format
      ● It has a complicated structure
      ● Sections include:
        ○ .text
        ○ .data
        ○ .idata
        ○ .reloc
        ○ Strings, etc.
      ● More information: Malware researcher's handbook (demystifying PE file)
  ○ Interestingly, three DLL (Dynamic Link Library) files are imported.
  ○ In WININET.dll, we can see that the malware uses the HTTP protocol.
  ○ In ADVAPI32.dll, we can check whether the malware touches registry files.
  ○ In KERNEL32.dll, we can check whether the malware waits on a signal or sleeps.
● Behavioral analysis on Cuckoo
  ○ Traces behavior (file/process/thread/registry/network) in time sequence.
  ○ Useful for figuring out cause-and-effect across processes, files, and the network.
  ○ Example: the malware creates a new file, runs it as a process, then writes to its memory.
● Cuckoo analysis results
  ○ Based on our analysis with Cuckoo, we can determine whether:
    ■ The malware uses the HTTP protocol to communicate
      ● Communicate with whom? The C&C?
      ● Web server access? For checking whether the C2 server is active?
      ● Commands through the HTTP protocol? Cookies?
    ■ The malware touches (creates/writes/reads) a file/registry/process
      ● Might this be a dropper? Or does it download a binary from the C2 server?
      ● What is the purpose of creating processes? Of modifying the registry?
● Control-flow graph analysis
  ○ Based on the information that we collected in the previous step, we are going to perform CFG analysis and symbolic execution
  ○ CFG:
    ■ A graph representation of the computation and control flow in the program
    ■ Nodes are basic blocks
    ■ Edges represent possible flow of control from the end of one block to the beginning of another
  ○ But in malware analysis, we analyze the CFG at the instruction level.
  ○ We provide a tool that helps you find the command-interpretation logic and the malicious logic
    ■ We list the system-call functions the malware uses internally
    ■ If you provide a score for each function (how malicious it is, or how likely the malicious logic is to use such a function), the tool will find where the malicious logic is, based on its score
      ● Example: if you set StrCmpNIA to have a score of 10, then a function that calls StrCmpNIA 5 times will have a score of 50.
      ● A higher score implies that more functions related to the malicious activity are used within that function.
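The scoring rule in the example above can be sketched as follows (the API names and scores here are hypothetical; the real tool reads the scores you write in score.h):

```python
# Hypothetical scores, as you might assign them in score.h
API_SCORES = {"StrCmpNIA": 10, "InternetOpenA": 8}

def function_score(call_sites, scores):
    """A function's score is the sum of the scores of the APIs it calls,
    counted once per call site."""
    return sum(scores.get(callee, 0) for callee in call_sites)

# A function calling StrCmpNIA five times scores 5 * 10 = 50,
# matching the StrCmpNIA example above.
score = function_score(["StrCmpNIA"] * 5, API_SCORES)
```

Functions with no scored calls stay at zero, so raising the scores of the networking APIs you saw in Cuckoo makes the command-handling functions stand out in the generated CFG.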
    ■ Your job is to write the score value for each function
  ○ More info: http://www.cs.cornell.edu/courses/cs412/2008sp/lectures/lec24.pdf
  ○ From our network analysis, we know that the malware opens an Internet connection to 128.61.240.66.
  ○ From our Cuckoo-based analysis, we know that the malware uses the HTTP protocol.
  ○ Moreover, it uses some particular functions to communicate and stay in touch with the command and control server.
  ○ Modify the score values for these particular functions to generate a better CFG for your analysis.
  ○ The file to edit is score.h
    ■ Path: /tools/cfg-generation/score.h
  ○ Build the control-flow graph
    ■ Executing ./generate.py stage1 gives you the CFG
      ● It finds the functions with higher scores
        ○ A high score implies that the function calls high-score functions during its execution
    ■ For stage2, use 'stage2' as the argument
    ■ Note: your graph and its memory addresses will vary from this example
    ■ In the example, the function entry is at address 405190
      ● And there is a function (marked as sub) with score 12 at address 40525a (marked in red)
      ● Use the block address, not the call sub_ address
    ■ This implies that:
      ● sub_4050c0 calls some Internet-related functions
      ● We need to find out what this command is
        ○ Run from 405190 to 40525a
● Finding the command
  ○ Finding commands with symbolic execution
    ■ We want to find a command that drives the malware from 405190 to 40525a
      ● Let's use symbolic execution to figure that out
  ○ What is symbolic execution?
    ■ Rather than executing the program with concrete input, symbolic execution treats the input data as symbolic variables, then calculates expressions over the input along each execution path.
    ■ Key challenges: path explosion, modeling statements and environments, constraint solving
  ○ Symbolic execution engines: KLEE, angr, Mayhem, etc.
  ○ A typical pipeline:
    ■ Load a binary into the analysis program
    ■ Translate the binary into an intermediate representation (IR)
    ■ Translate that IR into a semantic representation
    ■ Perform the actual analysis with symbolic execution
  ○ In the example, ONLY the conditions i=2, j=9 lead the program to print "Correct!"
  ○ Symbolic execution can solve the expressions required to reach a target, in this case "Correct!".
  ○ Let's apply this to the malware's command-and-control logic. A C&C bot (the malware) expects inputs (which solve the expressions) that trigger behaviors (the targets).
  ○ In the example, ONLY the 'launch-attack' and 'remove' commands (inputs) trigger attack() and destroy_itself().
  ○ Symbolic execution is able to find "launch-attack" as an input that triggers attack(), which is a malicious behavior.
  ○ Likewise, "remove" leads to destroy_itself(), another behavior.
  ○ Our job in this project is to use symbolic execution to find the inputs, then feed those inputs to the malware to trigger its behaviors.
● Finding commands with angr
  ○ We prepared a symbolic executor and a solver for you
    ■ Your job is to find the starting point of the function that interprets the command, and the end point where the malware actually executes a function that performs malicious operations
      ● Use the control-flow graph (CFG) analysis tool!
    ■ The symbolic executor is angr (http://angr.io/index.html)
  ○ How do you run it?
    ■ Go to ~/tools/sym-exec
    ■ Run: python ./sym_exec.py [program_path] [start_address] [end_address]
  ○ Replace the start and end addresses with the ones from your CFG graph.
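To make the i=2, j=9 example concrete: a constraint solver derives satisfying inputs from the branch conditions symbolically, while the toy sketch below (with an invented check function) reaches the same answer by brute-force enumeration, which only works here because the input space is tiny:

```python
def check(i, j):
    # Invented branch logic: exactly one input pair reaches "Correct!"
    if i == 2 and j == 9:
        return "Correct!"
    return "Wrong"

# Enumerate a small input space in place of real constraint solving.
solutions = [(i, j) for i in range(10) for j in range(10)
             if check(i, j) == "Correct!"]
# solutions == [(2, 9)]
```

Symbolic execution does the same hunt without enumeration: it keeps i and j symbolic, collects the constraints i == 2 and j == 9 along the path to "Correct!", and asks the solver for a model.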
    ■ Example: python ./sym_exec.py ~/shared/stage1.exe 4050c0 40518a
  ○ The command will be printed at the end (if it is found)
● Reconstructing the C2 server
  ○ After CFG analysis + symbolic execution, reconstruct the C2 server
  ○ The tool for reconstructing the C2 server is already on the VM
  ○ It runs nginx and a PHP script
    ■ The command files look like ~/tools/c2-command/stage*-command.txt
    ■ Your job is to add your commands to the relevant *.txt file
      ● The command that leads the execution from 405190 to 40525a is
      ● Important: be sure to put the '$' character before your commands, even if stage*-command.txt says that it's optional
      ● The order of commands in the file does not matter; they'll run in a random order
        ○ Note: this means that if you want to run only one particular command, you'll need to remove or comment out the other commands in your file
● angr
  ○ SimState
    ■ While angr performs symbolic execution, it stores the current state of the program in SimState objects.
    ■ A SimState is a structure that contains the program's memory, registers, and other information.
    ■ SimState provides interaction with memory and registers. For example, state.regs offers read and write access by register name, such as state.regs.eip, state.regs.rbx, state.regs.ebx, and state.regs.bh.
    ■ Creating an empty 64-bit SimState
  ○ Bitvectors
    ■ Since we are dealing with binary files, we don't deal with regular integers.
    ■ In a binary program, everything becomes bits and sequences of bits.
    ■ A bitvector is a sequence of bits used to perform integer arithmetic for symbolic execution.
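The fixed-width behavior just described can be illustrated in plain Python (no angr involved; this mirrors only the wrap-around arithmetic of concrete bitvectors, not the symbolic part):

```python
MASK32 = 0xFFFFFFFF  # a 32-bit bitvector holds values modulo 2**32

def bv_add(a, b):
    """Addition as a 32-bit bitvector performs it: overflow wraps around."""
    return (a + b) & MASK32

assert bv_add(4, 5) == 9            # ordinary arithmetic is unchanged
assert bv_add(0xFFFFFFFF, 1) == 0   # the carry out of bit 31 is discarded
```

angr's BVV/BVS values obey the same modular arithmetic, with the addition that a BVS can stay unknown until the constraint solver pins it down.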
    ■ Creating some 32-bit bitvector values:
      ● state.solver.BVV(4, 32) creates a 32-bit bitvector with value 4
      ● We can perform arithmetic operations and comparisons using bitvectors
  ○ Symbolic bitvectors
    ■ state.solver.BVS('x', 32) creates a 32-bit symbolic variable named x
    ■ angr allows us to perform arithmetic operations and comparisons on symbolic bitvectors as well.
  ○ Registers
    ■ A state provides access to the registers through state.regs.register_name, where register_name can be rcx, ecx, cx, ch, or cl. The same applies to the other registers.
    ■ Look at the types of the registers: they are bitvectors.
    ■ Look at the lengths of the registers examined below.
      ● They are all symbolic bitvectors because they have not been initialized yet.
    ■ cl, ch, cx, and ecx are all part of rcx.
    ■ You can compare the length and the location of cl, ch, cx, ecx, and rcx in angr with the actual architecture depicted below.
  ○ Constraints
    ■ In a CFG, a line like if (x > 10) creates a branch. Please look at the Symbolic Execution Concepts tutorial.
    ■ Assuming x is a symbolic variable, the constraint x > 10 is added to the successor state when the True branch is taken.
    ■ For the False branch, the negation of x > 10 is added instead.
    ■ Adding a constraint to a SimState:
      ● To constrain the cl register to equal 11:
        ○ state.add_constraints(state.regs.cl == 11)
        ○ or state.add_constraints(state.regs.cl == state.solver.BVV(0xb, 8)), since state.solver.BVV(0xb, 8) equals 11
      ● You can see in the example below that their effect on the SimState is the same.
  ○ Radare2
    ■ Launch radare2 with $ r2 ~/shared/payload.exe
    ■ Then type aaa, which analyzes everything (functions + basic blocks)
    ■ afl lists all functions
    ■ afl~name greps the list of functions for the given name
      ● afl~attack will list all the functions whose name contains "attack"
    ■ You can use Linux-style commands such as grep while inside the r2 console.
    ■ On the right side, you can see all the functions containing the attack vector (afl~send)
    ■ Using those API calls, this Linux malware performs DDoS attacks based on the commands it receives from the C&C server.
    ■ The example below shows how to find all the attack vectors calling sym.send/sym.sendto.
    ■ Now we have to iterate over all the attack functions on the right. For example, the example below shows three attack functions, and only one of them is called. Our focus is the "call sym.attack_?????" functions.
    ■ Let's analyze the example below.
      ● axt sym.attack_app_http has only one reference, which is a push instruction. This is not the attack function we are interested in.
      ● axt sym.attack_app_cfnull has no references at all. This is not the attack function we need to explore.
      ● axt sym.attack_???? is one of the functions listed in the right-hand example and has a "call sym.attack_?????" instruction. That is the function we need to explore further to determine the target address for the symbolic execution.
    ■ You need to find 2 attack functions.
    ■ After finding an attack function, we can determine the target address:
      ● First, step into the function using s sym.attack_????.
      ● Second, run pdf | grep sym.send or pdf | grep sym.sendto to determine the instruction address.
      ● Third, run s address_for_call_sym.send(to) to point at the "call sym.send" or "call sym.sendto" instruction.
      ● Lastly, print 2 instructions starting with the call sym.send/sym.sendto instruction.
    ■ The address of the instruction that follows "call sym.send(to)" is the target address for the symbolic execution.
    ■ For more information:
      ● https://github.com/radare/radare2
      ● https://www.radare.org/get/THC2018.pdf
● Other tools:
  ○ You don't have to use Radare2. Alternatives include:
    ■ objdump
    ■ IDA Pro (disassembly tool with GUI; free version)
      ● https://www.hex-rays.com/products/ida/support/download_freeware.shtml
    ■ Cutter (GUI for radare2)
      ● https://www.radare.org/cutter/
      ● https://github.com/radareorg/cutter

After stage1.exe
● If you find all of the commands for the stage1.exe malware, the malware will download stage2.exe by updating itself.
● Now that you've found the commands by running sym_exec.py, add those commands to stage1-commands.txt. Remember to put the $.
● Start up the Windows VM again, then copy stage1.exe to the desktop. Double-click on it and continue.
● Note: if stage1 fails to download stage2, your firewall might be blocking it.
  ○ This is actual malware, so some IDSes have signatures that match it.
● For stage2.exe, please follow the same steps as in the tutorial:
  ○ Check its network access with Wireshark
  ○ Redirect network traffic if required (if the connection fails)
  ○ Try to identify malicious functions by editing score.h and using the cfg-generation tool
  ○ Discover the list of commands using the symbolic execution tool
  ○ Fill the commands into ~/tools/c2-command/stage2-command.txt
  ○ Run it as mentioned before.

Linux Malware Analysis
● stage2.exe will download the stage 3 malware, which is payload.exe.
  ○ This is Linux malware.
● We need to handle the Linux malware differently from the Windows malware, and will use different tools and methods to analyze it.

Linux Malware Tools
● First, copy the Linux malware into the shared folder. The tools you will use are installed on the Linux host.
● ~/tools/sym-exec/linux_sym_exec.py
  ○ For Linux malware symbolic execution
  ○ Usage: python linux_sym_exec.py path_to_linux_mw start target
  ○ To make it work, you need to modify two linux_sym_exec.py functions:
    ■ targs_len_before and opts_len_before
● ~/tools/dynamicanalysis/
  ○ instrace.linux.log: the dynamic instruction trace for the Linux malware
  ○ detect_loop.py: you have to modify this file to find the loop in the given trace
    ■ Usage: python detect_loop.py
● Run python linux_sym_exec.py path_to_linux start target.
  ○ It won't be able to find any input because of path explosion. You need to add constraints to make the symbolic execution targeted.
● Follow the steps in assignment-questionnaire.txt and find the inputs.
● Analyze the dynamic instruction trace and locate the C&C communication.

Android Malware Analysis
● Manifest analysis
  ○ Identify suspicious components
● Static analysis
  ○ Search for C&C commands and trigger conditions
  ○ Vet the app for any anti-analysis techniques that need to be removed
● Dynamic analysis
  ○ Leverage the information found via static analysis to trigger the malicious behavior

Manifest Analysis
● Identify suspicious components
  ○ Broadcast receivers registering for suspicious actions
  ○ Background services
● Narrow the scope of analysis
  ○ Malicious apps are repackaged inside benign apps with thousands of classes.

Static Analysis
● Search for C&C commands and trigger conditions
● Identify anti-analysis techniques

Scenario
● Analyzing Android malware
  ○ You have received a malware sample, sms.apk.
  ○ You need to identify communication with the C&C server.
  ○ Identify anti-analysis techniques being used by the app.
  ○ Identify commands that trigger any malicious behavior.

Project Structure
● Android emulator
  ○ An emulator for Android 4.4 is pre-installed
    ■ Run 'run-emulator'
      ● This will start the Android emulator (this takes a long time, especially the first time you start it)
● Jadx
  ○ Disassembles apk files into Java source code.
● Apktool
  ○ Disassembles apk files into Smali.
  ○ Rebuilds apk files.
● Write-up (~/Android/MaliciousMessenger/writeup.pdf)
  ○ A detailed guide on how to complete the Android section of the lab.
● Android apps
  ○ ~/Android/MaliciousMessenger/tutorialApps
    ■ Emu-check.apk
      ● A tutorial example (shown as 'My application' in the emulator)
    ■ CoinPirate.apk
      ● Another tutorial example
  ○ ~/Android/MaliciousMessenger/sms.apk
    ■ The target app to analyze to answer the questionnaire
● READ ~/Android/MaliciousMessenger/writeup.pdf

Android Cheatsheet

Tips for assignment-questionnaire.txt
● Please use the latest version of VirtualBox when you import the VM. Please do not modify anything related to network settings in the VM.
● Domain names
  ○ On the questionnaire sheet, there are entries for writing domain names. Please follow these rules when answering those questions.
  ○ You should write the FQDN; that is, if the full domain name is canof.gtisc.gatech.edu, then write canof.gtisc.gatech.edu, not just gatech.edu or gtisc.gatech.edu.
  ○ For the others (connection checks, DDoS, sending info, etc.), you should get the exact domain name that the malware uses. For example, the IP address 130.207.188.35 belongs to both coe.gatech.edu and web-plesk5.gatech.edu.
  ○ Because there are multiple mappings, you cannot be sure which domain the malware used just by using nslookup. In this case, please get the domain names from the DNS packets in Wireshark instead.
  ○ All domains should be based on Wireshark DNS packets
    ■ e.g., get it from a DNS query packet, or redirect HTTP traffic into a local VM and examine the Host header.
  ○ If you look at the log in Wireshark, you will find the DNS query (Standard query) and the DNS response (Standard query response).
  ○ In the Domain Name System section, there is a Query section, like below:
  ○ Queries:
    ■ x.y.z: type A, class IN
  ○ Answers:
    ■ x.y.z: type CNAME, class IN, cname a.b.c
  ○ You should use x.y.z
● URLs
  ○ For all URLs, you do not have to specify the protocol (http://, https://, etc.).
  ○ However, if the HTTP traffic looks like the following:
    ■ POST /a/b/c/d?asdf=1234 HTTP/1.1
      Host: www.zzz.com
  ○ Then please write it as:
    ■ www.zzz.com/a/b/c/d?asdf=1234
● Writing commands in *.txt files under the c2-command directory
  ○ There are PHP scripts pre-installed locally in the VM that read the *.txt file for each stage.
    ■ These scripts read the commands from the TXT files and send them to the malware.
    ■ One caveat: these scripts send the commands in random order (i.e., if there are commands a, b, and c, the script will randomly choose one command and send it to the malware).
    ■ So if you want to test ONE command at a time, write only that command in the TXT file.
      ● Ex.: if you just want to run the command $uninstall, write only that command in stage1-command.txt.
● linux_sym_exec and detect_loop for the Linux malware
  ○ You can use the free IDA Pro, objdump, or radare2 for this task to find the called attack functions and the target addresses.
  ○ Look for angr examples on GitHub that add constraints to the state.
  ○ For the loop detection, focus on a function sequence that is called repeatedly.
● Correct command, but the malware is not working?
  ○ Note that some commands for stage 2 differ per student: they have a 4-digit hexadecimal number at the end of the command.
    ■ Ex.: a command for stage 2 is formatted like $COMMANDa1b4
    ■ (NOTE: three commands in stage 2 have the 4-digit hexadecimal tail.)
    ■ All commands in stage 3 have the 4-digit hexadecimal tail.
  ○ However, you might get only the front part of the command, like:
    ■ $COMMAND
    ■ This happens if the endpoint address of the symbolic execution is not set correctly. In that case, please set the correct end point so that you get the entire command.
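For the loop-detection tip above, one simple approach is to look for a window of the trace that immediately repeats. The sketch below is a hypothetical simplification of what detect_loop.py asks for (the real instrace.linux.log format differs; the trace here is invented):

```python
def find_repeating_window(trace, max_len=8):
    """Return (start, length) of the first window that is immediately
    followed by an identical window, i.e. a candidate loop body."""
    for length in range(1, max_len + 1):
        for start in range(len(trace) - 2 * length + 1):
            if trace[start:start + length] == trace[start + length:start + 2 * length]:
                return start, length
    return None

# Invented function-call trace: the C&C loop recv -> parse -> send repeats.
trace = ["init", "recv", "parse", "send", "recv", "parse", "send", "exit"]
# find_repeating_window(trace) == (1, 3)
```

The start index of the repeating window points at the loop entry, and the functions inside the window are the repeated sequence the questionnaire asks about.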
● Cuckoo
  ○ In the VM, we provide Cuckoo, a dynamic malware analysis framework.
    ■ It is very convenient and easy to use.
    ■ While you are running Cuckoo, you might see warnings and errors like "critical time ..." and "YARA signature ...". Please ignore them.
    ■ Because you are executing malware in the QEMU Windows VM, the framework needs to set a timeout.
      ● Cuckoo checks whether the malware has terminated.
      ● However, the three malware samples you will meet are never going to terminate (intentionally; modified by me for educational purposes).
      ● So please ignore "critical time ..., terminating".
    ■ In our case, the malware is never going to unfold even if you give it infinite execution time, unless you feed it the right inputs (the malware expects C2 commands).
  ○ iptables setting
    ■ If you check /home/analysis/.cuckoo/conf/kvm.conf, you will find how we set up the QEMU Windows host VM.
    ■ You will find that the IP of the host VM is 192.168.133.101.
    ■ If you want to see network behaviors in Cuckoo, you need to forward that IP in /home/analysis/tools/network/iptables-rules.
    ■ For example, in iptables-rules you would add:
      ● sudo iptables -t nat -A PREROUTING -p tcp -s 192.168.133.101 -d [DEST-IP] --dport 80 -j DNAT --to-destination 192.168.133.1:80

Miscellaneous VM Performance Tips

Part 1: Windows Malware / Generic VM Issues
● Try lowering your screen resolution
● Save often!
● Avoid using a resource-heavy IDE like IntelliJ, Eclipse, etc. Lightweight alternatives include gedit, vim, emacs, Sublime Text, Visual Studio Code, nano, etc.
● Most importantly, do/run only one task at a time. That means:
  ○ Run the Windows VM only when:
    ■ Sending commands to malware
    ■ Analyzing network traffic via Wireshark
    ■ Once done with those tasks, turn off the Windows VM.
  ○ Avoid running the Windows VM when:
    ■ Running Cuckoo analysis
    ■ Generating CFGs
    ■ Running symbolic execution (this is quite resource intensive; avoid doing other things so it finishes quickly)
(TIP: If this seems to be taking infinite memory/time, you are most likely trying to reach an unreachable/invalid address! Check your addresses!)
○ Try running the VM at a lower resolution (we recommend at least 1280×800, for legibility) if you have a very high resolution on your host machine. You can do this in 2 ways:
■ VirtualBox Menu – View > Virtual Screen 1 > Resize to a x b
■ Ubuntu Menu – Type “Displays” > Change it there
○ Restart after a task / stage. This is mostly a last resort, but restarting the VM after finishing a task/stage makes everything feel really smooth, instead of trying to free memory, etc. Just be sure to run ./reset in ~/tools/networks after each VM restart!

Part 2: Android
● Some of the above applies here as well (VM settings, resolution, etc.).
● Restarting after working on Part 1 helps a lot.
● If you still feel your Android emulator is slow, you can add the following flags to the emulator command flags in ~/bin/run-emulator: -memory 2048 -gpu swiftshader
● You can experiment with RAM allocation and CPU usage based on your machine, but keep in mind that the project VM has only been tested at 4 GB and with 2 or 3 CPUs.

Extra Tips
● Once you successfully complete the stage1 part and the stage2 file is downloaded on the Windows VM, you can move it to the shared folder for easier handling. Verify the file type as mentioned in the write-up, and handle it in the same manner as stage1.
● For stage2, do not forget to update the ‘iptables_rules’ file, and run ‘./reset’ afterwards.
● General tips – If your device frequently lags or takes a long time to execute, reboot it.
○ Insufficient resource allocation can cause issues; you can try reinstalling the VM image (deleting the previously stored state), and even VirtualBox as a last resort.
● Do NOT change the base snapshots.
● Ensure you have no firewalls set up.
● Some Mac users might be unable to unzip project3.zip to obtain the .ova file; in that case, log into Dropbox as a user instead of a guest. Verify the file properties afterwards.
● For all users – a partial file download will result in errors. Verify once before execution.
● Moreover, if you have a problem with your current device (it is too old or cannot allocate enough resources for a smooth experience), please contact us beforehand so we can arrange an alternative; we cannot provide one in the last few days.

Submission
Required files
● Zip the following files and upload report.zip to Canvas
○ Running ~/archive.sh will automatically zip all of the files
■ ~/report/assignment-questionnaire.txt
■ stage1.exe, stage2.exe, payload.exe (linux malware)
■ ~/tools/network/iptables_rules
■ ~/tools/cfg-generation/score.h
● Running ~/archive.sh will create report.zip automatically.
○ Please check the content of your zip file before submitting it to Canvas.
● Submit only ‘assignment-questionnaire.txt’ to Gradescope, and report.zip to Canvas (under the Project3 Assignment). If you do not submit report.zip on time, a 5-point deduction will be applied to your total score.

Questionnaire
● To get credit for the project, you have to answer the questionnaire, found on Canvas
○ Read assignment-questionnaire.txt
○ Carefully read the questions, and answer them in assignment-questionnaire.txt
○ For each stage, there are 4-6 questions regarding the behavior of the malware.
● Android Part
○ READ ~/Android/MaliciousMessenger/writeup.pdf
○ Carefully read the writeup, and answer in assignment-questionnaire.txt
○ Make sure you overwrite ANSWER_HERE

Rubric
● The value for each max score is within its particular section
○ Windows has 110 possible points
○ Android has 100.
○ As each section is worth an equal amount of your overall project grade, we normalize the Windows score by dividing by 1.1 (and rounding up), then average it with the Android score to get your final grade.
So effectively, each point in the table above is worth half a point of your final project grade (slightly less for Windows).
● If the Partial Credit column is blank, there is no partial credit for the question. “Ratio” refers to the Levenshtein ratio, a metric of similarity between strings.


[SOLVED] CS6238 Project III - Exploring Set-UID

CS 6238: Secure Computer Systems
Exploring Set-UID

Project Objectives
Set-UID is an important mechanism in Unix operating systems. When a Set-UID program is run, it assumes the program owner’s privileges. For example, if the program’s owner is root, then when anyone with execute permission runs this program, the program gains root’s privileges during its execution. Set-UID allows us to do many useful things, but unfortunately, it can also be exploited in several ways. Therefore, the objective of this project is two-fold:
1. Appreciate Set-UID’s good side by understanding why Set-UID is needed and how it is implemented.
2. Be aware of its bad side by understanding the potential security problems it can lead to when used improperly.
One quick point to keep in mind as you work on the various tasks: if a question asks whether your program is running as root, it assumes that you will run the program as the CS6238 user, not as root (we all know a program will run with root privileges when launched in the root terminal).

Project Tasks
This is an exploration lab. Your main task is to experiment with the Set-UID mechanism in Linux and write a lab report that describes your findings. You will be provided with a VM to complete the tasks. The VM will be shared on Canvas and mirror links will be made available. The password for the “cs6238” user is “cs6238”; to log in as root, use the command “sudo su” with password “cs6238”. You are required to accomplish the following tasks in Linux:
1. Task One (10 points)
(a) Why do the “passwd”, “chsh”, “su”, and “sudo” commands need to be Set-UID programs? (4 points)
(b) What will happen if they are not? (If you are not familiar with these programs, you should first learn what they do by reading their manual pages.) (3 points)
(c) Copy these command binary files to your own directory; the copies will not be Set-UID programs. Run the copied programs, observe the results, and describe what you see. (3 points)
2.
Task Two (15 points): Run Set-UID shell programs in Linux and describe and explain your observations.
(a) Log in as root, copy /bin/zsh to /tmp, and make it a set-root-uid program with permission 4755. Then log in as a normal user and run /tmp/zsh. Will you get root privilege? (Please describe your observations.)
Note: If you cannot find /bin/zsh in your operating system, please use the following command to install it. For Ubuntu: sudo apt-get install zsh
Note: in our pre-built Ubuntu VM image, zsh is already installed.
(b) Instead of copying /bin/zsh, this time copy /bin/bash to /tmp and make it a set-root-uid program. Run /tmp/bash as a normal user. Will you get root privilege? (Please describe and explain your observations.)

Setup for remaining tasks: As you will find out from the previous task, /bin/bash has certain built-in protection that prevents abuse of the Set-UID mechanism. To see what could be done before such a protection scheme was implemented, we are going to use a different shell program called /bin/zsh. In some Linux distributions (such as Fedora and Ubuntu), /bin/sh is a symbolic link to /bin/bash. To use zsh, we need to link /bin/sh to /bin/zsh.
Execute the following commands to change the default shell to zsh:
su (Enter root password)
cd /bin
rm sh
ln -s zsh sh
3. Task Three (20 points) The PATH environment variable
The system(const char *cmd) library function can be used to execute a command within a program. The way system(cmd) works is by invoking the /bin/sh program, and then letting the shell program execute cmd. Because of the invoked shell program, calling system() within a Set-UID program is extremely dangerous. This is because the actual behavior of the shell program can be affected by environment variables, such as PATH; these environment variables are under the user’s control.
By changing these variables, malicious users can control the behavior of the Set-UID program.
The program below is supposed to execute the /bin/ls command; however, the programmer only uses the relative path for the ls command, rather than the absolute path:

#include <stdlib.h>

int main() {
    system("ls");
    return 0;
}

(a) Can you get the above program, running with Set-UID (owned by root), to run your own code instead of /bin/ls to list files? If you can, is your code running with root privilege? Describe and explain your observations.
(b) Now, change /bin/sh so it points back to /bin/bash and repeat the above attack. Can you get root privilege? Describe and explain your observations.

4. Task Four (20 points) The difference between system() and execve().
Before you work on this task, please make sure that /bin/sh points to /bin/zsh.
Background: Bob works for an auditing agency, and he needs to investigate a company for a suspected fraud. For the purposes of the investigation, Bob needs to be able to read all the files in the company’s Unix system; on the other hand, to protect the integrity of the system, Bob should not be able to modify any file. To achieve this goal, Charlie, the sysadmin, wrote a special set-root-uid program (see below), and then gave execute permission to Bob. This program requires Bob to type a file name at the command line, and then it will run /bin/cat to display the specified file. Since the program is running as root, it can display any file Bob specifies. However, since the program has no write operations, Charlie is very sure that Bob cannot use this special program to modify any file.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    char *v[3];
    if (argc < 2) {
        printf("Please type a file name.\n");
        return 1;
    }
    v[0] = "/bin/cat";
    v[1] = argv[1];
    v[2] = 0;
    /* Set q = 0 for Question a, and q = 1 for Question b */
    int q = 0;
    if (q == 0) {
        char *command = malloc(strlen(v[0]) + strlen(v[1]) + 2);
        sprintf(command, "%s %s", v[0], v[1]);
        system(command);
    } else
        execve(v[0], v, 0);
    return 0;
}

a) Set q = 0 in the program. This way, the program will use system() to invoke the command.
• Is this program safe?
• If you were Bob, could you compromise the integrity of the system? For example, could you remove a file that is not writable by you?
(Hint: remember that system() invokes /bin/sh, and then runs the command within the shell environment. We tried an environment-variable attack in the previous task; here, let us try a different attack. Please pay attention to the special characters used in a normal shell environment.)
b) Set q = 1 in the program. This way, the program will use execve() to invoke the command. Do your attacks from task (a) still work? Please describe and explain your observations.

5. Task Five (20 points) The LD_PRELOAD environment variable
To make sure Set-UID programs are safe from manipulation of the LD_PRELOAD environment variable, the runtime linker (ld.so) will ignore this environment variable if the program is a Set-UID root program, except under some conditions. We will figure out what these conditions are in this task.
(step 1) Let us build a dynamic link library. Create the following program, and name it mylib.c. It basically overrides the sleep() function in libc:

#include <stdio.h>

void sleep(int s) {
    printf("I am not sleeping!\n");
}

(step 2) We can compile the above program using the following commands:
gcc -fPIC -g -c mylib.c
gcc -shared -Wl,-soname,libmylib.so.1 -o libmylib.so.1.0.1 mylib.o -lc
(step 3) Now, set the LD_PRELOAD environment variable using the following command:
export LD_PRELOAD=./libmylib.so.1.0.1
(step 4) Compile the following program (put this program in the same directory as libmylib.so.1.0.1):

/* myprog.c */
int main() {
    sleep(1);
    return 0;
}

(step 5) Please run myprog under the following conditions and observe what happens. Based on your observations, describe when the runtime linker will ignore the LD_PRELOAD environment variable, and explain why.
• Make myprog a regular program and run it as a normal user.
• Make myprog a Set-UID root program and run it as a normal user.
• Make myprog a Set-UID root program and run it in the root account.
• Make myprog a Set-UID user1 program (i.e., the owner is user1, another user account), and run it as a different (non-root) user.

6. Task Six (15 points) Relinquishing privileges and cleanup
To be more secure, Set-UID programs usually call the setuid() system call to permanently relinquish their root privileges. However, sometimes this is not enough. Compile the following program and make it a set-root-uid program. Run it in a normal user account and describe what you observe. Will the file /etc/zzz be modified? Please explain your observations.

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void main() {
    int fd;
    fd = open("/etc/zzz", O_RDWR | O_APPEND);
    /* Assume that /etc/zzz is an important system file, and it is
       owned by root with permission 0644 */
    /* Simulate the tasks conducted by the program */
    sleep(1);
    /* After the task, the root privileges are no longer needed,
       it is time to relinquish the root privileges permanently.
    */
    setuid(getuid());   /* getuid() returns the real uid */
    if (fork()) {
        /* In the parent process */
        close(fd);
        exit(0);
    } else {
        /* In the child process */
        /* Now, assume that the child process is compromised, and malicious
           attackers have injected the following statements into this process */
        write(fd, "Malicious Data", 14);
        close(fd);
    }
}

Submission: You need to submit a detailed lab report describing what you have done for each task and what you have observed. You are also required to write explanations for your observations.


[SOLVED] CIT593 Module 12 - Dynamic Memory and File I/O

The LC4 Disassembler: Dynamic Memory and File I/O Instructions

Contents: Assignment Overview · Learning Objectives · Advice · Getting Started (Codio Setup, Starter Code, Object File Format Refresher, Linked List Structure) · Requirements (General Requirements; Disassembler: lc4_memory.c add_to_list / search_opcode / search_address / print_list / delete_list, lc4.c main, lc4_loader.c open_file / parse_file, lc4_disassembler.c reverse_assemble; Extra Credit) · Collaboration (Collaboration Options, Collaboration Requirements, Collaboration Tips) · Suggested Approach (High Level Overview; Slightly More Detailed Overview: Implement the LinkedList, Setup the main Function, Implement the LC4 Loader, Implement the Disassembler, Putting It All Together) · Testing (Files for Testing, Unit Testing, GDB for Debugging, Segmentation Faults Demystified, Valgrind for Memory Leaks and Memory Management Errors) · Submission (README file, Submission Check, Consistency Checks, The Actual Submission) · Grading (Disassembler, Makefile, Unit Tests, Integration Tests, Valgrind deductions, Extra Credit) · Flowchart · FAQ (Quick Hints, Useful Commands, Programming Tips, Endianness) · Resources

Assignment Overview
In the last assignment, you created a .obj object file from a .asm assembly file. In this assignment, you will write a program that opens and reads a .obj file created by PennSim, parses it, and loads it into a linked list representing the LC4’s program and data memories (similar to what PennSim’s “loader” does).
Additionally, you will be able to convert the binary file contents back to the assembly it came from! This is known as reverse assembling, or disassembling.

Learning Objectives
This assignment will cover the following topics:
● Review the LC4 Object File Format
● Implement a LinkedList in C
● Read and process binary files
● Disassemble binary data into a human-readable format
● Use debugging tools such as GDB and Valgrind
● (Optionally) Collaborate with another programmer on a complex project

Advice
● Read this entire document before starting.
● Read this entire document before starting.
● Start on this project early.
● Find a collaboration partner, even if you decide to do the assignment solo.
● Attend recitation, or watch the recordings.

Getting Started
Codio Setup
Open the Codio assignment via Canvas. This is necessary to link the two systems. You will see many files; the directory “obj files for student testing” contains a selection of sample object files you can use for testing your program. Do not assume that we will use these exact files for grading (we won’t). You will need to move these to the root directory in order to test with them.

Starter Code
We have provided a basic framework and several function definitions that you must implement.
lc4.c – must contain your main function.
lc4_memory.c – must contain your linked list helper functions.
lc4_memory.h – must contain the declaration of your row_of_memory structure and the declarations of your linked list helper functions.
lc4_loader.h – must contain your loader function declarations.
lc4_loader.c – must contain your .obj parsing function.
lc4_disassembler.h – must contain your disassembler function declarations.
lc4_disassembler.c – must contain your disassembling function.
Makefile – must contain the targets: lc4_memory.o, lc4_loader.o, lc4_disassembler.o, lc4, all, clean, and clobber.

Object File Format Refresher
The following is the format for the binary .obj files created by PennSim from your .asm files. It represents the contents of memory (both program and data) for your assembled LC-4 assembly programs. In a .obj file, there are 3 basic sections indicated by 3 header “types”: Code, Data, and Symbol.
● Code: 3-word header (xCADE, <address>, <n>), n-word body comprising the instructions.
○ This corresponds to the .CODE directive in assembly.
● Data: 3-word header (xDADA, <address>, <n>), n-word body comprising the initial data values.
○ This corresponds to the .DATA directive in assembly.
● Symbol: 3-word header (xC3B7, <address>, <n>), n-character body comprising the symbol string. These are generated when you create labels (such as “END”) in assembly. Each symbol is its own section.
○ Each character in the file is 1 byte, not 2 bytes.
○ There is no NULL terminator.

Linked List Structure
In the file lc4_memory.h, you will see the row_of_memory structure defined. Because you will not know the number of instructions in advance, you will create a Linked List of row_of_memory nodes, each node representing a single row of memory.

Requirements
General Requirements
● You MUST NOT change the filenames of any file provided to you in the starter code.
● You MUST NOT change the function or struct declarations of any function or struct provided to you in the starter code.
○ You MUST NOT add any additional .c source or .h header files.
● Your program MUST compile when running the command make.
● You MUST NOT have any compile-time errors or warnings.
● You MUST test your code with Valgrind before submission. Valgrind MUST report 0 errors and 0 memory leaks. See the Valgrind section for required details.
● You MUST remove or comment out all debugging or error message print statements before submitting.
● You MUST follow the requirements in the Collaboration section, even if working alone.
● You MUST NOT use externs or global variables.
● Your program MUST be able to handle .obj files produced by PennSim.
● You SHOULD comment your code, since this is a programming best practice.
● You MUST follow the individual requirements for the functions (below).

Disassembler
You MUST follow the requirements in the source files provided as starter code.

lc4_memory.c: add_to_list
This function adds a new row_of_memory node to the LinkedList.
● If a node with the specified address already exists in the LinkedList, this function MUST update the contents field and take no other action.
● Otherwise, this function MUST
○ allocate space for a new node,
○ set the address and contents fields based on the function arguments,
○ set the label and assembly fields to NULL,
○ not allocate memory for the label or assembly fields.
● If the head pointer is NULL, this function MUST set the newly created node as the head of the LinkedList.
● Otherwise, this function MUST insert the newly created node into the LinkedList based on the address field, in ascending order.
● This function MUST return 0 for success, and SHOULD return -1 if malloc fails.

lc4_memory.c: search_opcode
This function searches the LinkedList until it finds a node whose opcode (the four most significant bits of the contents field) matches the opcode argument AND whose assembly field is NULL.
● For this function, the opcode to check for is the four least significant bits of the opcode argument and ranges from 0 to 15 inclusive.
● This function MUST traverse the LinkedList starting from the head.
● If it finds a node where the opcode to check for matches the four most significant bits of the contents field AND the assembly field is NULL, then it MUST return a pointer to this node.
● This function MUST return NULL if no matching node is found in the LinkedList.
● This function MUST return NULL if the LinkedList is empty.

lc4_memory.c: search_address
This function searches the LinkedList until it finds a node where the address field matches the address argument.
● This function MUST traverse the LinkedList starting from the head.
● If it finds a node where the address to check for matches the address field, then it MUST return a pointer to this node.
● This function MUST return NULL if no matching node is found in the LinkedList.
● This function MUST return NULL if the LinkedList is empty.

lc4_memory.c: print_list
This function prints the LinkedList in a specific format.
● If the head pointer is NULL, this function MUST take no action.
● This function MUST print a row of column titles; see the Putting It All Together section for an example header.
● This function MUST print a single line for each node.
● If attempting the extra credit, it MUST NOT print the assembly field for nodes where the opcode of the contents is not 0001.
● It MUST print the address and contents fields in hexadecimal with leading zeroes (4 characters wide).
● It MUST print ONLY the memory list.
● The registers in assembly instructions SHOULD be separated by a comma
○ e.g. ADD R1, R2, R3
● If the contents of a node are 0, it MUST print them as 0 or 0000.
● If the assembly of a node is NULL, it MUST print the assembly as (null) or leave it blank.
● It MUST print the label when one exists; otherwise it MUST leave that section blank.

lc4_memory.c: delete_list
This function deletes the LinkedList node by node.
● This function MUST correctly free all allocated memory for each node.
● This function MUST set the head pointer to NULL upon deletion.

lc4.c: main
The main function MUST follow the steps outlined in the starter code.
● It MUST hold the row_of_memory* memory (do not modify this line).
● It MUST NOT call malloc at any point.

lc4_loader.c: open_file
This function opens a file for reading.
● It MUST attempt to open the file file_name.
○ If the file exists, it MUST open the file and return a FILE* to the opened file.
○ Otherwise, it MUST return NULL.
● You MUST NOT attempt to append .obj to the provided file_name.

lc4_loader.c: parse_file
This function parses an LC4 .obj file.
● It MUST correctly handle endianness. That is, it MUST adjust for reading 16-bit words from the file. It MUST handle the .obj files that PennSim produces.
● For each 3-word header in the file, it MUST
○ read the 3-word header,
○ parse the <type>, <address>, and <n> words and correctly determine the type of header and the memory to allocate,
○ read the remaining words or bytes,
○ create a new row_of_memory node,
○ add the new node to the LinkedList (using the lc4_memory functions).
● After reading the file, it MUST
○ close the file,
○ return 0.

lc4_disassembler.c: reverse_assemble
This function disassembles each node to generate a value for the assembly field.
● It MUST search every node in the LinkedList.
● If the node’s address is in a CODE region, then it MUST do the following:
○ If the opcode is 0001, it MUST
■ translate the contents field into the human-readable instruction mnemonic,
■ allocate space to hold the instruction string, and
■ store the string into the assembly field.
● If the node’s address is not in a CODE region, then it MUST NOT attempt disassembly.
● It MUST provide an actual label when translating a Branch or Jump instruction into assembly language, not an immediate value. So JMP END, NOT JMP #18.
○ Note that this only applies to extra credit attempts.
● After checking all nodes, it MUST return 0.

Extra Credit
For optional extra credit, build the complete LC4 Disassembler. Your program MUST fulfill these additional requirements, while still fulfilling all the requirements for the rest of the assignment:
● finish the disassembler to translate all instructions in the ISA.
● create a new output file <file_name>.asm, where <file_name> is the name of the object file without the extension (i.e., create a new file whose name replaces the .obj extension with .asm).
● write the equivalent LC4 assembly of the object file to this new output file.
● PennSim MUST be able to assemble this file.
● PennSim MUST be able to load this file into memory.
● PennSim MUST show that the loaded contents of your assembled file and the original object file are equivalent.

Collaboration
You are not required to have a teammate. Points will not be deducted if you choose to work alone.
If you would like to work as a team, you MUST have only one teammate. Groups of 3 or more are not permitted.

Collaboration Options
You have three options for collaboration:
1. No collaboration
Complete the assignment independently. You must include a README file that states you worked completely alone.
2. Collaboration, submitting the same code
After each teammate completes their LinkedList functions, you are free to split up the remainder of the work as you see fit. You can work on everything together, or each team member can work on part of the project. Both partners turn in the same code and will receive the same grade. You must include a README file indicating you are submitting the same code, identifying both students, and detailing how you chose to divide up the work (e.g. who contributed what for each function).
3. Collaboration, submitting different code

Collaboration Requirements
If you choose to work as a team, each team member MUST follow these additional assignment requirements:
1. Each team member must complete all 5 functions supporting the LinkedList data structure independently.
2. You will likely find that you need to make changes to your lc4_memory.c source file as a result of tests conducted by you and your partner. Implementing the LinkedList functions on your own is an invaluable learning experience, but once you have completed the functions, you are free to discuss them with your partner and refine the final version for submission.
3. Each function MUST include a comment at the top with the name of the person who authored it. If there is anything more you think we should know about your submission, it should be included in your README file.
4. Each team must submit a brief README file that describes which Collaboration Option you chose and your division of labor. If you are doing Option 1, you MUST indicate that you worked alone. If you are doing Option 2, you MUST include both your and your teammate’s names and specify who wrote each function.
If you are doing Option 3, you MUST include both your and your teammate’s names and a brief (a few sentences max) description of how you worked together.
The README should not be more than one page in length.
We strongly recommend that you and your teammate work together to test, debug, and fix memory leaks in your code.

Collaboration Tips
Here are some suggestions to discuss with your partner before starting. These are ideas that other students have found helpful, but they are not required.
● Discuss your preferred communication platforms. Do you like to use Slack? Email? Text messages? Regular video calls on Slack or Zoom? Pick something you will be able to check daily to help your collaborator with debugging challenges as they arise.
● Most people appreciate updates about when you plan to work on the assignment. Letting your teammate know that you plan to do most of your work over the weekend, for example, demonstrates a commitment to the work and helps set expectations for people with different working schedules. If you prefer to leave work until the last minute, your teammate deserves to know this (we recommend starting as early as possible).
● Assign roles as soon as possible. It should be clear who is responsible for writing which sections of the code before you start the project. Consider writing the README first and updating it if necessary.
● Review the assignment instructions (again). Consider which parts you think will be the most challenging and which parts you feel the most confident in. Try to split up the work so that each person has at least one challenging section and one section they feel confident in. This allows each person to contribute their individual skills while also having opportunities to learn new things.
● Plan to check in multiple times. This is a large project; it won’t be done in a few hours, and probably not in a few days.
● Plan at least one day to debug and fix memory leaks before turning in the assignment.
While you will be primarily graded on functionality, it is important that you learn to use malloc and free correctly and that you learn to use Valgrind to find and eliminate any memory leaks. Code that contains memory leaks will not receive full credit.
● Whenever possible, try to explain your work to your teammate in your own words. Explaining your code to someone else is a great learning tool for both people!
● Back up your work to Codio frequently.
● Check in with your teammate before making changes to code your teammate has written.
● Communicate when asking for an extension.

Suggested Approach
This is a suggested approach. You are not required to follow it as long as you follow all of the other requirements.

High Level Overview
Follow these high-level steps and debug thoroughly before moving on to the next.
1. Create a pointer-based framework in C to hold a LinkedList by writing the following functions:
a. A function to create a new node in the LinkedList. If this is the first node in the list, this function will create a new list. If there is already an identical node in the list, this function will update the contents of that pre-existing node.
b. A function to search the LinkedList for a node with a specific memory address value.
c. A function to search the LinkedList for a node containing a specific opcode.
d. A function to print the elements of the list in the specified format.
e. A function to delete the entire list and free the memory it was using.
2. Write the open_file function to open a .obj file specified by the user via the command line.
3. Write the parse_file function to extract information from the open file, place the information into your LinkedList, and close the file.
4. Write the reverse_assemble function to update each node in your LinkedList with the assembly-language equivalent of the binary words extracted by parse_file.
5. Print your LinkedList.
6.
Debug and resolve any lingering memory leaks and other memory management errors. Great High Level Overview, but I really need a Slightly More Detailed Overview Okay, I guess we can give some more details. Implement the LinkedList The first thing to do is to get the LinkedList working, that is, create the list, place new nodes into the correct position, etc. The first files to view in the helper file are lc4_memory.h and lc4_memory.c. In these files you will notice the structure that represents a row_of_memory as referenced in the LinkedList section. You will also see several helper functions that will serve to manage a LinkedList of row_of_memory nodes. Your job is to implement these LinkedList helper functions using your knowledge from the last assignment. You must implement everything described by the comments in the starter code. If you wish to implement additional helper functions, feel free to add them to any .c source file, but remember to add the function prototypes to the appropriate .h header file. Setup the main Function Accept arguments and pass them to the functions. Set up the general flow of high-level functions. Switch to modifying the file called lc4.c. This serves as the main function for the entire program. The head of the linked list must be stored in main. Notice that a pointer named memory will do just that. main then extracts the name of the .obj file the user has passed in when they ran your program (this is in the argv[] parameter). Next, it calls lc4_loader.c's open_file function and holds a pointer to the open file. Then, it calls lc4_loader.c's parse_file function to read and interpret the .obj file. Lastly, it disassembles the file, prints the LinkedList to the terminal, deletes the LinkedList, and finally terminates the program. All of these functions are described in greater detail in later subsections. We have provided the order of the function calls and their purpose as shown in comments in lc4.c. 
Once you have properly implemented lc4.c and have it accept input from the command line, a user should be able to run your program as follows: ./lc4 <filename>.obj where <filename> can be replaced with any file name the user desires as long as it is a valid .obj file that was created by PennSim. If no file is passed in, your program should generate an error telling the user what went wrong, like this: error1: usage: ./lc4 <filename>.obj Implement the LC4 Loader The loader is responsible for reading a file, parsing each line, and creating/modifying nodes. Most of the work of your program will take place in the file called lc4_loader.c. In this file, start by implementing the function open_file to take in the name of the file the user of your program has specified on the command line (see lc4_loader.h for the definition of open_file). If the file exists, the function should return a handle to that open file, otherwise a NULL should be returned. As shown in the Flowchart, have the function read in the 3-word header from the file. You'll notice that all of the LC4 .obj file headers consist of 3 fields: <type>, <address>, and <n>. As you read in the first header in the file, store the <address> and the <n> field into local variables. Then determine the type of header you have read in: CODE, DATA, or SYMBOL. The CODE header The body of the CODE section is <n> words long. This is a sample CODE section: CA DE 00 00 00 0C 90 00 D1 40 92 00 94 0A 25 00 0C 0C 66 00 48 01 72 00 10 21 14 BF 0F F8 Notice the <n> field is 0x000C, or decimal 12. Because each instruction in LC4 is 1 word long, this indicates that the next 12 words in the .obj file are 12 LC4 instructions. The first LC4 instruction in the 12-word body is: 0x9000, which is a CONST assembly instruction if you convert it to binary. Allocate memory for a new node in your linked list to correspond to the first instruction. 
As it is the first instruction in the body, and the address has been listed as 0x0000, you would populate the row_of_memory structure as follows: address 0000 label NULL contents 9000 assembly NULL next NULL In a loop, read in the remaining instructions from the .obj file; allocate memory for a corresponding row_of_memory node for each instruction. As you create each row_of_memory, add these nodes to your linked list, ordering the list by address (you should use the functions you've created in lc4_memory.c to help you with this). For the first 3 instructions listed in the sample above, your linked list would look like this: The DATA Header The procedure for reading in the DATA sections is identical to reading in the CODE sections. These would become part of the same linked list since PROGRAM and DATA are all in one memory on the LC4; they are partitioned by address. The SYMBOL Header For the following SYMBOL header/body: C3 B7 00 00 00 04 49 4E 49 54 The address field is: 0x0000. The symbol field itself is 0x0004 bytes long. The next 4 bytes 0x49 0x4E 0x49 0x54 are ASCII for INIT. This means that the label for address 0000 is INIT. Your program must search the LinkedList nodes, find the appropriate address that this label is referring to, and populate the label field for the node. Note: <n> tells us exactly how much memory to malloc to hold the string; however, you must add a byte to hold the NULL terminator. For INIT, this means you need to allocate 5 bytes. For the example above, the node 0000 in your LinkedList would be updated as follows: address 0000 label INIT contents 9000 assembly NULL next (pointer to the next node) It is possible that an address has two labels in the .obj file. In this case, use the last one that appears in the .obj file. Once you have read the entire file and created and added the corresponding nodes to your LinkedList in address order, close the file and return to main. 
If you encounter an error in closing the file, print an error, free all the memory associated with the LinkedList, and then exit the program. Implement the Disassembler Go through the row_of_memory nodes and update the assembly field based on the contents. In lc4_disassembler.c, write a function reverse_assemble that will take as input memory, the head of the LinkedList populated in the previous section. reverse_assemble must translate the hexadecimal representation of selected instructions into their assembly equivalent. Refer to the LC4 ISA Instruction document for details. To simplify this problem a little, do not translate every single instruction into its assembly equivalent. Only translate instructions with the opcode of 0001, that is, ADD REG, MUL, SUB, DIV, and ADD IMM. The immediate value MUST be prefixed with #, x, or X (as appropriate), for example: ADD R1, R1, #15 == ADD R1, R1, xF == ADD R1, R1, XF Do not translate data stored at an address in the DATA section. As shown in the flowchart, this function will call your linked list's search_opcode helper function. Your search_opcode function should take as input a 4-bit value representing the opcode to search for and return the first node in the LinkedList that matches the opcode and also has a NULL assembly field. For example, here's the definition of an ADD instruction from the ISA: 0001 ddds ss00 0ttt When searching for an instruction with opcode == 0001, the opcode parameter passed to search_opcode must be: 0000 0000 0000 0001 So you will need to use the C bitwise operators to line these two values up before comparing them. search_opcode finds the first instruction in the LinkedList where these two 4-bit fields match, with the additional constraint that rows of memory that already have the assembly instruction filled in do not count as a match. 
When/if a node in your linked list is returned, you'll need to examine the contents field of the node and translate the instruction into its assembly equivalent. Once you have done that, allocate memory for the ASCII string and store this string in the assembly field of the node. Repeat this process until all the nodes in the LinkedList with an opcode == 0001 have their assembly fields properly translated. As an example, the figure below shows a node on your list that has been “found” and returned when the search_opcode function was called. From the contents field, we can see the hexadecimal value 0x128B, which is 0001 001 010 001 011 in binary. From the ISA, we realize the sub-opcode reveals that this is actually a MUL instruction. We can then generate the string “MUL R1, R2, R3” and store it back in the node in the assembly field. For this work, we strongly encourage you to investigate the switch statement in C (any good book on C will help you understand how this works and why it is more practical than multiple if/else/else/else statements). Putting It All Together One last thing to do in main is to call a function to print the contents of your LinkedList to the screen. So call the print_list function in lc4_memory.c. You will need to implement the printing helper function to display the contents of your lc4's memory list like this: INIT 0000 9000 0001 D140 0002 9200 … 0009 128B MUL R1, R2, R3 END 000A 0 (and so on…) Testing Files for Testing In the last assignment, you created a .obj file. Try loading that file into the Codio workspace for this assignment and use your lc4 program on it. You know exactly how that program should disassemble. To test further, bring up PennSim, write a simple program in it, output a .obj from PennSim, then read it into your program and see if you can disassemble it. You can create a bunch of test cases very easily with PennSim. You should test your lc4 program on a variety of .obj files, not just simple examples. 
We have provided a selection of .obj files in the “obj files for student testing” folder in the Codio workspace. Additionally, this directory contains .sol solution files, which are the expected output. You can compare the expected output to your output to see if you are getting the expected results. From the default home workspace directory (when the terminal prompt ends with ~), you will need to use cd submit to change the directory to the submit directory. Unit Testing When writing such a large program, it is a good strategy to “unit test.” This means, as you create a small bit of working code, compile it, and create a simple test for it. As an example, once you create your very first function, add_to_list, write a simple main and test it out. Call it, print out your “test” list, and see if this function even works. Run valgrind on the code, see if it leaks memory or accesses uninitialized memory locations. Once you are certain it works, and doesn't leak memory, go on to the next function, search_address; implement that, and test it out. DO NOT write the entire program, compile it, and then start testing it. You will never resolve all of your errors this way. You need to unit test your program as you go along or it will be impossible to debug. GDB for Debugging gdb allows you to inspect the actual contents of memory, which is an advantage over print statements because print statements only print ASCII characters. Further, you can see the actual contents of memory of any variable at any time, while print statements only print when you call the print statement during the execution of your program. Segmentation Faults Demystified Segmentation faults are VERY common failures in C programs. They can be hard to pin down. First we should understand why they happen: 1. Case 1: trying to dereference a NULL pointer. This often happens when calling a function that returns a pointer or NULL in case of error. 
If you don't check the return value from the function and then proceed to dereference the pointer, you will get a segmentation fault. 2. Case 2: an incorrect assumption that memory on the stack (and heap) is initialized to zero. This is NOT the case. If you want the memory to be initialized to zero, you need to do this explicitly, possibly using the memset function. You might have code that checks to see if the memory is 0 or NULL and then takes some action based on this. If the memory is not initialized to zero, it will have random, unpredictable contents. This is often referred to as (X) or “don't care” in the lectures. So, how do you figure out where the segmentation fault is occurring? The simplest way to find out is using GDB. After compiling using the -g option, run the program with arguments (the gdb commands start and run allow you to specify the command line arguments), and you should go right to the segmentation fault. You can then use the gdb where command, which will tell you the line number in your program where the failure occurs. For example: gdb -q -tui --args ./lc4 test1.obj (gdb) run This runs gdb on your lc4 program with argv[1] set to test1.obj. Valgrind for Memory Leaks and Memory Management Errors Prior to exiting your program, you MUST properly free any memory that you allocated. We will be using a memory-checking program known as valgrind to ensure your code properly releases all memory allocated on the heap. Simply run your program, lc4, as follows: valgrind --leak-check=full --track-origins=yes ./lc4 test1.obj where valgrind is the name of the program to run, --leak-check=full is a Valgrind option to perform the full memory leak analysis, --track-origins=yes is a Valgrind option to show you where some errors originate from, ./lc4 is the name of the program that Valgrind will analyze, and test1.obj is the argument to your lc4 program, i.e. 
the name of the .obj file you want to disassemble. ● Valgrind will find errors related to accessing uninitialized memory locations (invalid read/write errors). This typically results from assuming that malloc zeroes out the memory it returns. malloc DOES NOT do this. Submission README file Submission Check There is a single “submission check” test that runs once you upload your code to Gradescope. This test checks that you have submitted all required files and also that your program and any autograder code compile successfully. It does not run your program or provide any feedback on whether it works or not. This check just ensures that all the required components exist. This test is performed after uploading to Gradescope. Consistency Checks The Actual Submission You will submit this assignment to Gradescope in the assignment entitled Assignment 12: The LC4 Disassembler. Download the required .c source and .h header files (as well as any additional helper files required) as well as your Makefile and README from Codio to your computer, then upload all of these files to the Gradescope assignment. Do not submit intermediate files (anything .o). We will only grade the last submission uploaded. If you are working with a partner, you MUST also click the “View or edit group” link in the upper right of the submission page to add your partner. We will adjust these based on the README content, but give your partner peace of mind by including them at submission time. Do not mark your Codio workspace complete. Only the submission in Gradescope will be used for grading purposes. Grading We will only grade the last submission, regardless of the results of any previous submission. We will not be providing partial credit for autograder tests. Disassembler We will only use valid, disassemblable .obj files. We will not test your program with deliberately faulty .obj files. 
Makefile 5 points As part of the submission check, the autograder will ensure that you have submitted all the required files and that your makefile correctly creates the final executable. Unit Tests 50 points The autograder will test your open_file, add_to_list, delete_list, search_address, search_opcode, and reverse_assemble functions by providing inputs directly to these functions. The autograder will deduct points if they do not produce the correct output. Integration Tests 45 points Valgrind deductions Extra Credit The Extra Credit is worth 15 percentage points, so the highest grade on the assignment is 115%. Your extra credit MUST NOT break functionality for the non-extra credit requirements. Make a backup of your finalized program before attempting the extra credit. If your program fails to meet the basic requirements, you will end up losing more points than the extra credit would gain. There is no partial credit. It must work completely for any credit. We will not give guidance on how to do this since it is designed to be a capstone challenge problem. Flowchart FAQ Quick Hints ● You can assume that the maximum length of an assembly instruction is 100, the maximum length of a label is 70, and the maximum length of a file_name is 100. ● Make sure all of your loops that traverse the memory list look at the first and last element of the memory list. This is a VERY common mistake. ● You do not have to check that the addresses in the .obj file are valid. That is, addresses for CODE memory will always be after a .CODE directive, and addresses for DATA memory will always be after a .DATA directive. Essentially, all test files will be valid LC4 object files. Useful Commands ● The hexdump -C command displays an .obj file one byte at a time rather than one word at a time. The first column displayed by hexdump is the byte offset. Programming Tips ● There are many possible errors and we will not check them. 
But you should as part of debugging and ensuring your program does have the correct output. Some example errors: ○ the input file isn't validly formatted ○ malloc can't find sufficient memory ○ etc. It is a best practice to print an error message and exit if any of those things happen, but the autograder will not be testing those sorts of edge cases. ● While we want your program to have no memory leaks, it is more important that your program actually runs. Get the program working, then go back and fix memory leaks. ● You must allocate memory for strings before calling strcpy. ● add_to_list must keep the memory list in sorted order by address. CODE vs. DATA isn't relevant. ● Note that feof() returns TRUE only after a read has already failed at the end of the file; until then it will be FALSE. This means that you need to check all of your calls to fread or fgetc to make sure they didn't hit the end of the file. Endianness ● The x86 (the processor used by Codio) has a different endianness than the LC4. When doing fread's of 2-byte words, you must swap the bytes to adjust for this. That same swapping isn't needed with fgetc or fread's of size 1. ● If you read the .obj file into memory one word at a time using fread, you will need to swap for endianness. In contrast, if you choose to read the .obj file into memory one byte at a time with fgetc, the endianness doesn't need to be adjusted. However, you will have to combine two bytes into a word using bitwise operators. Resources ● Valgrind documentation https://www.valgrind.org/docs/manual/quick-start.html#quick-start.interpret ● Checking end of file https://faq.cprogramming.com/cgibin/smartfaq.cgi?id=1043284351&answer=1046476070 ● C Bitwise Operators in Canvas [no link since this changes every semester]


[SOLVED] Cit593 module 10 - C strings

C Strings Table of Contents Assignment Overview 3 Learning Objectives 3 Advice 3 Getting Started 4 Codio Setup 4 Starter Code 4 my_string.h 4 my_string.c 4 program1.c 4 Background – Introduction and my_strlen 5 Problem 1 – Creating Your Own Library of String Functions 6 Overview 6 Requirements 6 Hints 7 Problem 2 – Adding Non-Standard String Functions to Your Library 8 Overview 8 Requirements 9 Problem 3 – Parsing Strings 11 Overview 11 The sscanf and sprintf functions 11 Arguments to main 11 Problem 3 Task 12 Requirements 12 Problem 4 (Extra Credit) – my_strtok 14 Overview 14 Requirements 14 A Hint 14 Testing Your Functions 15 Submission 16 Submission Checks 16 Consistency Checks 16 The Actual Submission 16 Grading 17 Main Assignment 17 Extra Credit 17 Hints or FAQs 18 Resources 19 Assignment Overview The goal of this assignment is to give you experience with strings in C and with the library of functions designed to work with strings, give you an appreciation of how those functions work, and also continue to help you work with pointers and arrays (in the context of C Strings). Learning Objectives ● Implement a library of functions based on requirements ● Program with pointers using array notation and pointer arithmetic ● Practice with C Strings ● Make working makefiles Advice ● Test with many examples, not just one ● Start early and ask questions early. Do not wait until the last minute to do this assignment! ● Remember that C Strings require a null terminator ('\0') ● Don't return pointers to stack arrays from functions. The compiler will probably warn you about this too. Always compile with the -Wall option. Getting Started Codio Setup ● Open the Codio assignment via Canvas. This is necessary to link the two systems. ● We have provided three starter code files. ● We have not provided a makefile. Part of the assignment is for you to do this yourself. 
You can refer to the previous assignment for an example about building a C project that includes multiple .c files. Starter Code We have provided a basic framework and several function definitions that you must implement. my_string.h This file contains the function declarations you must implement. Aside from adding the required declarations for my_strrev, my_strccase, and optionally my_strtok, do not modify this file. my_string.c This file contains empty implementations for the functions defined in my_string.h. We have provided the implementations of my_strlen using array notation and pointer arithmetic for you. You will provide the remaining implementations in this file. program1.c This is a test environment program only. We will not review it or even look at it and it will not be used for grading. You are free to write any code necessary to test your implementations. Background – Introduction and my_strlen char my_string[100] = "Tom"; strlen(my_string) would return 3. Even though there are 100 bytes allocated on the stack for the string, since there are only 3 characters (followed by a NULL), the length of the string is indeed 3. my_strlen_array treats the incoming argument (char* string) as if it is an array using array notation (i.e. with square brackets [ and ]): size_t my_strlen_array(const char *str) { int len = 0; while (str[len] != '\0') { len++; } return (len); } my_strlen_pointer treats the incoming argument as the pointer it truly is, using pointer arithmetic to determine the string's length: size_t my_strlen_pointer(const char *str) { const char *s; for (s = str; *s; ++s) ; return (s - str); } Note: size_t is not a built-in C type; it is typedef'ed, that is, it is a shortcut for unsigned long. Your task for this assignment is to implement your own library of string functions to mimic the standard C library string functions. In Codio, we have provided a header file called my_string.h. 
In that header file, we have declared several functions: my_strlen_array, my_strcpy_pointer, etc. In my_string.c, we have implemented only two of the many functions: my_strlen_array and my_strlen_pointer, as described above. You will implement the remaining functions. In a third file: program1.c, we have provided some basic code that calls the functions in your my_string library and compares the output of them to functions in the standard C-library string.h. This is one way to quickly check if your output is correct. Look carefully at these three files before continuing.Problem 1 – Creating Your Own Library of String Functions For this problem, your task is to implement your own library of string functions to mimic the standard C library string functions.Overview ● Your job is to complete the implementation of my_string.c and test the functionality in program1.c. ● Notice that you must make two versions of the same function, one using array notation and the other using pointer notation. ○ As an example, my_strlen_array uses array notation to calculate the length of the string, while my_strlen_pointer uses a much more efficient implementation using pointers to do the same thing. This is your chance to truly play with pointers and see if you can come up with more efficient techniques to solve problems than were done using array notation. ● Since many of these functions will be new for you, we have included helpful links in the Resources section so you can learn to use each of these functions before you try to implement them! ○ The main concept you need to understand is that, if a function declares a variable as const, that means the function promises to not modify that variable.Requirements ● You MUST NOT modify my_string.h in any way, except to add the required function declarations my_strrev, my_strccase, and (optionally) my_strtok. ● You MUST NOT use the standard library string.h in your implementations. 
● You MUST complete the implementations of the my_string library in my_string.c. ○ You MUST implement my_strcpy_array using array notation and my_strcpy_pointer using pointer notation. ○ You MUST implement my_strchr_array using array notation and my_strchr_pointer using pointer notation. ○ You MUST implement my_strcat_array using array notation and my_strcat_pointer using pointer notation. ○ You MUST implement my_strcmp_array using array notation and my_strcmp_pointer using pointer notation. ● You MUST create a makefile called makefile with the following targets: ○ my_string.o ○ program1 ○ all ○ clean, which MUST remove all .o files only ● You MUST NOT use a _array function inside a _pointer function ● You MUST NOT use a _pointer function inside a _array function ● You MUST NOT have “debug print statements” in any implementation. If you choose to use print statements inside your functions, you MUST comment them out or remove them before submission. We discourage print statements in general and suggest learning gdb as your debugging tool. ● Your implementations MUST function the same as the standard library functions. ○ Part of the assignment is to explore these functions and see how they work with different kinds of inputs. This means you will need to think critically about edge cases, see how the standard library handles them, and implement the same functionality on your own. Therefore, we will not provide a set of requirements for these functions. ○ You MUST return a NULL pointer in other scenarios, which you will need to discover by investigating the string.h standard library functions. ● You SHOULD comment your code since this is a programming best practice. Hints ● For my_strcmp, the return value just needs to be the correct sign (-, 0, or +), not the actual char difference. 
The specific magnitude of the negative or positive number returned does not matter. Problem 2 – Adding Non-Standard String Functions to Your Library Overview ● In the previous section you implemented and tested functions that mimic those in the standard C library. Now you'll add two functions that don't exist in the C library, but will exist in your own my_string library! ● my_strrev ○ This function takes a single string argument, reverses the contents of the string that is passed in, and returns a pointer to the resulting string. ○ As an example: char my_string[] = "Tom"; char *ptr = my_strrev(my_string); After my_strrev returns, my_string contains "moT", and ptr points to the first element in my_string. ● my_strccase ○ This function takes a single string argument, converts each character of the string to the opposite case, and returns a pointer to the resulting string. ○ As an example: char my_string[] = "Tom"; char *ptr = my_strccase(my_string); After my_strccase returns, my_string contains "tOM", and ptr points to the first element in my_string. ○ As a hint, examine the ASCII table; you will see that if you work on the hexadecimal characters directly, you can very easily convert them to their upper or lowercase equivalents! ● You do not need to make two versions of these functions (pointer and array). You can implement them any way you see fit. You will need to modify my_string.h (adding the proper declaration statement for each of these new functions) and my_string.c (adding the proper definition of each of these new functions). And you will need to create program2.c to properly test these new functions you've created. ● Create a new program called program2.c to test these new functions. ● Be sure to add a “program2” target to your existing makefile. Requirements ● You MUST call the two functions my_strrev and my_strccase. 
● my_strrev MUST ○ take a single char* argument (the string to reverse) ○ reverse the characters “in place” ■ It MUST NOT return a new pointer but rather modify the contents of the original string ● my_strccase MUST ○ be called my_strccase (yes, there are two cs, for “change case”) ○ take a single char* argument (the string to change cases) ○ change the case of the characters “in place” (lowercase becomes uppercase and uppercase becomes lowercase) ■ It MUST NOT return a new pointer but rather modify the contents of the original string ■ It MUST NOT use ctype.h. We are explicitly forbidding this header since it trivializes the assignment. ○ return a pointer that points to the start of the modified string ○ take no action on non-letter characters (e.g. numbers or symbols) ● You MUST add function declarations for the two functions to my_string.h. ○ You MUST NOT modify my_string.h in any other way. ● You MUST NOT use the standard library string.h in your implementations. ● You MUST complete the implementations of the two functions in my_string.c. ○ You MUST implement the functions using either array notation or pointer notation. You are free to choose either method for these functions. ● You MUST update your makefile for the following targets: ○ program2 ○ all ○ clean ○ clobber ● You MUST NOT have “debug print statements” in any implementation. If you choose to use print statements inside your functions, you must comment them out or remove them before submission. We discourage print statements in general and suggest learning gdb as your debugging tool. ● You SHOULD comment your code since this is a programming best practice. Problem 3 – Parsing Strings Overview The sscanf and sprintf functions ● sscanf and sprintf are two very useful functions, related to scanf and printf, frequently used to parse strings and convert data from strings into different data types. 
● sscanf works identically to scanf except, instead of reading the keyboard for input, it reads a C String as its input. ● sprintf works identically to printf except, instead of using the ASCII display for output, it uses a C String as its output. ● Since you already know how to use scanf and printf, this problem will feel familiar. The two links in the Resources section provide a reference for their arguments/returns and basic function. ○ DO NOT use your version of my_strcat; use the official library strcat in string.h for this problem. ○ You should be familiar with this function by now, but we have a reference link for it in the Resources section. Arguments to main ● Recall from the lectures on the stack that the function main always has a blank spot for arguments, but our declaration of main never has any arguments. Well, it's actually possible for main to take arguments. BUT, they can only be specified as follows: int main(int argc, char** argv); ○ The first argument, argc (argument count), contains the number of arguments passed to main. ○ The second argument, argv (argument vector), is actually a pointer to an array of strings. Each element of the array is a single char* for one of the arguments. ● How exactly do you pass arguments to main? You do it when you start your program in the terminal. ● Take the following code and put it in a file called program3.c: #include <stdio.h> int main(int argc, char** argv) { printf("# of arguments passed: %d\n", argc); for (int i = 0; i < argc; i++) { printf("argv[%d] = %s\n", i, argv[i]); } return (0); } ● Compile it, and then run it with this command: ./program3 arg1 2 arg3 4 arg5 ● Watch the output and look at the code above to see how it works! Try it with different arguments and watch how things change. Problem 3 Task ● Notice that all the arguments passed in are treated as C Strings in your program. Even though 2 is a number, it is treated as a NULL terminated character array (a C String) inside the argv array. 
● You will be adding additional code to program3.c to
  ○ read each char* argument in the argv array,
  ○ determine if it is an integer or a string,
  ○ store the integers into a new array,
  ○ store the strings into a single large string,
  ○ print the contents of the integer array with each integer on a new line,
  ○ and then print the combined large string.
● You will also need to remove any other print statements from the program.

Requirements
● You MUST write your solution to the task in program3.c.
● Your program MUST do the following:
  ○ Process all arguments provided to your program.
  ○ Remove the ./ characters from the zeroth argument of argv.
  ○ Convert any argument that is actually an integer from its string form into an integer using sscanf.
    ■ As a small hint, look at the return type of sscanf (notice that it returns the number of matches it made to your string).
  ○ Store any integer arguments into an array of integers.
    ■ For the above example, your program would generate an array: {2, 4}
    ■ We will not be testing strings that start with integers (e.g. 123abc). If an argument begins with an integer, you can assume that it is always an integer and not a string.
  ○ Store any non-integer argument into 1 large string, each argument separated by spaces.
    ■ You MUST use either strcat or sprintf.
    ■ For the above example, your program would generate a string: “program3 arg1 arg3 arg5”.
  ○ Print the contents of your integer array with each element on a new line.
  ○ Print the contents of your single string.
  ○ Print ONLY the integers and string, without any labels or any other additions. For the above example, your program MUST print exactly:

2
4
program3 arg1 arg3 arg5

● You MUST update your makefile for the following targets:
  ○ program3
  ○ all
  ○ clean
  ○ clobber
● You MUST NOT have “debug print statements” in your program. If you choose to use print statements as part of debugging, you MUST comment them out or remove them before submission.
We discourage print statements in general and suggest learning gdb as your debugging tool.
● You SHOULD comment your code since this is a programming best practice.

Problem 4 (Extra Credit) – my_strtok

Overview
● One of the more difficult string functions to use and to implement is strtok(). This function takes a string and “tokenizes” it; that is, it separates the original string into a series of tokens. This is similar to the Java or Python split method.
● Since this is an extra credit problem, we will provide minimal guidance. But we do have a resource in the Resources section.

Requirements
● You MUST add function declarations for this function to my_string.h.
  ○ You MUST NOT modify my_string.h in any other way.
● You MUST add a function definition for this function to my_string.c.
● You MUST NOT use the standard library string.h in your implementations.
● You MUST implement the functions using either array notation or pointer notation. You are free to choose either method for this function.
● You MUST update your makefile for the following targets:
  ○ program4
  ○ all
  ○ clean
  ○ clobber
● You MUST NOT have “debug print statements” in any implementation. If you choose to use print statements inside your functions, you must comment them out or remove them before submission. We discourage print statements in general and suggest learning gdb as your debugging tool.
● You SHOULD comment your code since this is a programming best practice.

A Hint
● On the first call to my_strtok, we pass a string we want to tokenize and a delimiter. my_strtok must return the first token (the portion of the string up to but not including the first occurrence of the delimiter).
● On each subsequent call to my_strtok, we pass a NULL pointer instead of the string, and my_strtok must return the next token.
● When no more tokens are found, my_strtok must return NULL.
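Taken together, the hints above suggest a shape like the following. This is purely a sketch under assumptions of our own: the signature (a single-character delimiter) is an assumption, since the assignment leaves the exact interface to you, and the static pointer is one way, not the only way, to keep state between calls.

```c
#include <stddef.h>

/* Sketch only: tokenize str on a single-char delimiter, strtok-style.
 * A static pointer remembers where the previous call stopped. */
char* my_strtok(char* str, char delim) {
    static char* next = NULL;
    if (str != NULL) {
        next = str;              /* first call: start at the given string */
    }
    if (next == NULL) {
        return NULL;             /* nothing left to tokenize */
    }
    while (*next == delim) {
        next++;                  /* skip leading delimiters */
    }
    if (*next == '\0') {
        next = NULL;
        return NULL;             /* no more tokens */
    }
    char* token = next;
    while (*next != '\0' && *next != delim) {
        next++;                  /* advance to the end of this token */
    }
    if (*next == delim) {
        *next = '\0';            /* terminate the token in place */
        next++;
    } else {
        next = NULL;             /* hit end of string: last token */
    }
    return token;
}
```

Calling my_strtok(s, ',') on "one,two,,three" and then my_strtok(NULL, ',') repeatedly yields "one", "two", "three", then NULL.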
This indicates that my_strtok must “remember” the string passed in when first called.

Testing Your Functions
● program1.c and program2.c are for student testing only. We will not review them or test them in any way.
● Remember to test the return results from functions like my_strrev and my_strccase that both modify the string in place and also return a pointer to the modified string.
● You can assume that we will not pass any NULL pointers to your string functions. You will have to return NULL in some cases, but you do not have to check if any arguments to your functions are NULL. You can also assume that if we wish to modify the string, a string literal will not be passed.
● We recommend testing your functions against the standard library string functions (string.h). The output and return values should be the same.
● Note that my_strcat_array and my_strcat_pointer modify the original string, so you will want to test with different strings to avoid confusion between test runs.

Submission

Submission Checks
There is a single “submission check” test that runs once you upload your code to Gradescope. This test checks that you have submitted all required files and also that your program compiles and any autograder code compiles successfully. It does not run your program or provide any input on whether it works or not. This check just ensures that all the required components exist. This test is performed after uploading to Gradescope.

Consistency Checks
The autograder will also show the results of six tests:
1. testStrCatArrayHelloWorld
2. testStrCatPointerHelloWorld
3. testStrchrArrayFirstChar
4. testStrchrPointerFirstChar
5. testStrcmpArraySameWord
6. testStrcmpPointerSameWord
The remaining tests will be hidden until after grades are published.

The Actual Submission
You will submit this assignment to Gradescope in the assignment entitled Assignment 10: Strings in C.
Download the required .c source and .h header files (as well as any additional helper files required) and your Makefile from Codio to your computer, then upload all of these files to the Gradescope assignment. We expect my_string.c, my_string.h, program3.c, and makefile. Do not submit program1.c, program2.c, or program4.c. Do not submit intermediate files (anything .o). We will only grade the last submission uploaded. Do not mark your Codio workspace complete. Only the submission in Gradescope will be used for grading purposes.

Grading
This assignment is worth 127.5 points, normalized to 100% for gradebook purposes.

Main Assignment
Problem 1 (standard functions) is worth 72 points (each function is 9 points, with equally weighted sub-tests per function). Problem 2 (custom functions) is worth 18 points (each function is 9 points, with equally weighted sub-tests per function). Problem 3 (parsing strings) is worth 10 points.

Problems 1, 2, and the Extra Credit are tested with Unit Testing. We will run different scenarios for each function to validate the functionality (partial credit based on which tests fail). Problem 3 checks the final output produced by your program and compares that output to the expected output. It must match exactly for credit: double check that your program does not have any extra output. We will only grade the last submission, regardless of the results of any previous submission. We will not be providing partial credit for autograder tests.

Extra Credit
The Extra Credit is worth 6 percentage points, so the highest grade on the assignment is 106%. Your extra credit must not break functionality for the non-extra-credit requirements. There is no partial credit. It must work completely for any credit. We will not give guidance on how to do this since it is designed to be a challenge problem.

Hints or FAQs

Hints from previous semesters
● Don’t mis-spell my_strccase.
● Write many test cases.
A common cause of low grades is not being thorough in testing.
● Don’t forget the NULL terminator in your strings.
● You will need to add -g to your makefile for all intermediate steps to use gdb for debugging those intermediate object files.
● You no longer need to log gdb output to gdb.txt. That was only for Assignment 9. You will probably never log gdb output again and we do not expect to see this anymore.

Resources
strlen reference: https://www.tutorialspoint.com/c_standard_library/c_function_strlen.htm
strcpy reference: https://www.tutorialspoint.com/c_standard_library/c_function_strcpy.htm
strchr reference: https://www.tutorialspoint.com/c_standard_library/c_function_strchr.htm
strcat reference: https://www.tutorialspoint.com/c_standard_library/c_function_strcat.htm
strcmp reference: https://www.tutorialspoint.com/c_standard_library/c_function_strcmp.htm
sscanf reference: https://www.tutorialspoint.com/c_standard_library/c_function_sscanf.htm
sprintf reference: https://www.tutorialspoint.com/c_standard_library/c_function_sprintf.htm
strtok reference: https://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm
strtok reference (Linux manual pages): https://man7.org/linux/man-pages/man3/strtok_r.3.html
The const modifier: http://www.geeksforgeeks.org/const-qualifier-in-c/


[SOLVED] Cit593 module 11- making the lc4 assembler

Making the LC4 Assembler Instructions

Contents
Assignment Overview
Learning Objectives
Advice
Getting Started
  Codio Setup
  Starter Code
Object File Format Refresher
Requirements
  General Requirements
  Assembler
    assembler.c: main
    asm_parser.c: read_asm_file
    asm_parser.c: parse_instruction
    asm_parser.c: parse_add
    asm_parser.c: parse_xxx
    asm_parser.c: str_to_bin
    asm_parser.c: write_obj_file
  Extra Credit
Suggested Approach
  High Level Overview
  Great High Level Overview, but I really need a Slightly More Detailed Overview
    Part 0: Setup the main Function to Read the Arguments
    Part 1: Read the .asm File
    Part 2: Parse an Instruction
    Part 3: Parse an ADD Instruction
    Part 4: Converting the binary string to a hexadecimal formatted integer
    Part 5: Writing the .obj object file
Testing
  Validate Output with PennSim
  Files for Testing
  Unit Testing
  GDB for Debugging
Submission
  Submission Checks
  The Actual Submission
Grading
  Assembler
  Extra Credit
FAQ
  Quick Hints
  Formatting
  Endianness
Resources

Assignment Overview
C files fall into two categories: “text” and “binary”. In this assignment you’ll work with both types by reading in a text file and writing out a binary file. You will read an arbitrary .asm file (a text file intended to be read by PennSim) and write a .obj file (the same type of binary file that PennSim would write out). Aside from reading and writing out the files, your task will be to make a mini-LC4-Assembler! An assembler is a program that reads in assembly language and generates its machine equivalent.
This assignment will require a bit more programming rigor than we’ve had thus far, but now that you’ve gained a good amount of programming skill in this class and in others, it is the perfect time to tackle a large programming assignment (which is why the instructions are so many pages).

Learning Objectives
This assignment will cover the following topics:
● Review the LC4 Object File Format
● Read text files and process binary files
● Assemble LC4 programs into executable object files
● Use debugging tools such as GDB

Advice
● Start early
● Ask for help early
● Do not try to do it all in one day

Getting Started

Codio Setup
Open the Codio assignment via Canvas. This is necessary to link the two systems. You will see many files. At the top-level workspace directory, the main files are asm_parser.h, asm_parser.c, assembler.c, and PennSim.jar. Do not modify any of the directories or any file in any of the directories.

Starter Code
We have provided a basic framework and several function definitions that you must implement.
assembler.c – must contain your main function.
asm_parser.c – must contain your asm_parser functions.
asm_parser.h – must contain the definition for ROWS and COLS; must contain function declarations for read_asm_file, parse_instruction, parse_reg, parse_add, parse_mul, str_to_bin, write_obj_file, and any helper function you implement in asm_parser.c.
test1.asm – example assembly file.
PennSim.jar – a copy of PennSim to check your assembler.

Object File Format Refresher
The following is the format for the binary .obj files created by PennSim from your .asm files. It represents the contents of memory (both program and data) for your assembled LC-4 Assembly programs. In a .obj file, there are 3 basic sections indicated by 3 header “types”: Code, Data, and Symbol.
● Code: 3-word header (xCADE, address, n), n-word body comprising the instructions.
  ○ This corresponds to the .CODE directive in assembly.
● Data: 3-word header (xDADA, address, n), n-word body comprising the initial data values.
  ○ This corresponds to the .DATA directive in assembly.
● Symbol: 3-word header (xC3B7, address, n), n-character body comprising the symbol string. These are generated when you create labels (such as “END”) in assembly. Each symbol is its own section.
  ○ Each character in the file is 1 byte, not 2 bytes.
  ○ There is no NULL terminator.

Requirements

General Requirements
● You MUST NOT change the filenames of any file provided to you in the starter code.
● You MUST NOT change the function declarations of any function provided to you in the starter code.
● Your program MUST compile when running the command make.
● You MUST NOT have any compile-time errors or warnings.
● You MUST remove or comment out all debugging print statements before submitting.
● You MUST NOT use externs or global variables.
● You SHOULD comment your code since this is a programming best practice.
● Your program MUST be able to handle .asm files that PennSim would successfully assemble. We will not be testing with invalid .asm files.
● Your program MUST NOT crash/segmentation fault.
● You MUST provide a makefile with the following targets:
  ○ assembler
  ○ asm_parser.o
  ○ all, clean, clobber

Assembler

assembler.c: main
● You MUST NOT change the first four instructions already provided.
● The main function:
  ○ MUST read the arguments provided to the program.
    ■ the user will use your program like this: ./assembler test1.asm
  ○ MUST store the first argument into filename.
  ○ MUST print an error1 message if the user has not provided an input filename.
  ○ MUST call read_asm_file to populate program[][].
  ○ MUST parse each instruction in program[][] and store the binary string equivalent into program_bin_str[][].
  ○ MUST convert each binary string into an integer (which MUST have the correct value when formatted with “0x%X”) and store the value into program_bin[].
  ○ MUST write out the program into a .obj object file which MUST be loadable by PennSim’s ld command.

asm_parser.c: read_asm_file
This function reads the user file.
● It SHOULD return an error2 message if there is any error opening or reading the file.
● It MUST read the exact contents of the file into memory, and it MUST remove any newline characters present in the file.
● It MUST work for files that have an empty line at the end and also for files that end on an instruction (i.e. do not assume there will always be an empty line at the end of the file).
● It MUST return 0 on success, and it MUST return a non-zero number in the case of failure (it SHOULD print a useful error message and return 2 on failure).

asm_parser.c: parse_instruction
This function parses a single instruction and determines the binary string equivalent.
● It SHOULD use strtok to tokenize the instruction, using spaces and commas as the delimiters.
● It MUST determine the instruction function and call the appropriate parse_xxx helper function.
● It MUST parse ADD, MUL, SUB, DIV, AND, OR, XOR instructions.
  ○ It MUST parse ADD IMM and AND IMM if attempting that extra credit.
● It MUST return 0 on success, and it MUST return a non-zero number in the case of failure (it SHOULD print a useful error message and return 3 on failure).

asm_parser.c: parse_add
This function parses an ADD instruction and provides the binary string equivalent.
● It MUST consider the first argument to be the full instruction.
● It MUST correctly update the opcode, sub-opcode, and register fields following the LC4 ISA.
● It SHOULD call a helper function parse_reg, but we will not be testing this function.
● It MUST return 0 on success, and it MUST return a non-zero number in the case of failure (it SHOULD print a useful error message and return 4 on failure).

asm_parser.c: parse_xxx
You MUST create a helper function similar to parse_add for the other instruction functions required in parse_instruction.
● They MUST consider the first argument to be the full instruction.
● They MUST correctly update the opcode, sub-opcode, and register fields following the LC4 ISA.
● They SHOULD call a helper function parse_reg, but we will not be testing this function.
● They MUST return 0 on success, and they MUST return a non-zero number in the case of failure (it SHOULD print a useful error message and return a unique error number on failure).

asm_parser.c: str_to_bin
This function converts a C string containing 1s and 0s into an unsigned short integer.
● It MUST correctly convert the binary string to an unsigned short int which can be verified using the “0x%X” format.
● It SHOULD use strtol to do the conversion.

asm_parser.c: write_obj_file
This function writes the program, in integer format, as a LC4 object file using the LC4 binary format.
● It MUST create and write an empty file if the input file is empty.
● It MUST change the extension of the input file to .obj.
● It MUST use the default starting address 0x0000 unless you are attempting the .ADDR extra credit.
● It MUST close the file with fclose.
● It MUST return 0 on success, and it MUST return a non-zero number in the case of failure (it SHOULD print a useful error message and return 7 on failure).
● The generated file MUST load into PennSim (and you MUST check this before submitting), and the contents MUST match the .asm assembly program.

Extra Credit
Option 1: modify your read_asm_file function to ignore comments in .asm files. You MUST handle all types of comments for credit.
Option 2: modify your program to handle ADD IMM and AND IMM instructions. Both MUST work completely for credit.
Option 3: modify your program to handle the .CODE and .ADDR directives.
Option 4: modify your program to handle the .DATA, .ADDR, and .FILL directives.

Suggested Approach
This is a suggested approach. You are not required to follow this approach as long as you follow all of the other requirements.
High Level Overview
Follow these high-level steps and debug thoroughly before moving on to the next.
1. Initialize all arrays to zero or '\0'.
2. Call read_asm_file to read the entire .asm file into the array program[][].
   a. Using test1.asm as an example, after read_asm_file returns, program[][] should then contain:

        0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
   0  'A'  'D'  'D'  ' '  'R'  '1'  ','  ' '  'R'  '0'  ','  ' '  'R'  '1'  '\0'
   1  'M'  'U'  'L'  ' '  'R'  '2'  ','  ' '  'R'  '1'  ','  ' '  'R'  '1'  '\0'
   2  'S'  'U'  'B'  ' '  'R'  '3'  ','  ' '  'R'  '2'  ','  ' '  'R'  '1'  '\0'
   3  'D'  'I'  'V'  ' '  'R'  '1'  ','  ' '  'R'  '3'  ','  ' '  'R'  '2'  '\0'
   4  'A'  'N'  'D'  ' '  'R'  '1'  ','  ' '  'R'  '2'  ','  ' '  'R'  '3'  '\0'
   5  'O'  'R'  ' '  'R'  '1'  ','  ' '  'R'  '3'  ','  ' '  'R'  '2'  '\0'  X
   6  'X'  'O'  'R'  ' '  'R'  '1'  ','  ' '  'R'  '3'  ','  ' '  'R'  '2'  '\0'
   7  '\0'  X    X    X    X    X    X    X    X    X    X    X    X    X    X

3. In a loop, for each row X in program[][]:
   a. Call parse_instruction, passing it the current row in program[X][] as input to parse_instruction. When parse_instruction returns, program_bin_str[X][] should be updated to have the binary equivalent (in string form).
   b. Call str_to_bin passing program_bin_str[X][] to it. When str_to_bin returns, program_bin[X] should be updated to have the hexadecimal equivalent of the binary string from program_bin_str[X].
4. Once the loop is complete, program_bin_str[][] should contain program[][]'s equivalent:

        0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
   0  '0'  '0'  '0'  '1'  '0'  '0'  '1'  '0'  '0'  '0'  '0'  '0'  '0'  '0'  '0'  '1'  '\0'
   1  '0'  '0'  '0'  '1'  '0'  '1'  '0'  '0'  '0'  '1'  '0'  '0'  '1'  '0'  '0'  '1'  '\0'
   2  '0'  '0'  '0'  '1'  '0'  '1'  '1'  '0'  '1'  '0'  '0'  '1'  '0'  '0'  '0'  '1'  '\0'
   3  '0'  '0'  '0'  '1'  '0'  '0'  '1'  '0'  '1'  '1'  '0'  '1'  '1'  '0'  '1'  '0'  '\0'
   4  '0'  '1'  '0'  '1'  '0'  '0'  '1'  '0'  '1'  '0'  '0'  '0'  '0'  '0'  '1'  '1'  '\0'
   5  '0'  '1'  '0'  '1'  '0'  '0'  '1'  '0'  '1'  '1'  '0'  '1'  '0'  '0'  '1'  '0'  '\0'
   6  '0'  '1'  '0'  '1'  '0'  '0'  '1'  '0'  '1'  '1'  '0'  '1'  '1'  '0'  '1'  '0'  '\0'
   7  '\0'  X    X    X    X    X    X    X    X    X    X    X    X    X    X    X    X

5.
Also after the loop is complete, the array program_bin[] should contain program_bin_str[][]'s equivalent in binary (formatted in hexadecimal here):

0  0x1201
1  0x1449
2  0x1691
3  0x12DA
4  0x5283
5  0x52D2
6  0x52DA

program_bin[] now represents the completely assembled program.
6. Write out the .obj file in binary using the LC4 Object File Format.

Great High Level Overview, but I really need a Slightly More Detailed Overview
Okay, I guess we can give some more details.

Part 0: Setup the main Function to Read the Arguments
Open assembler.c from the helper files; it contains the main function for the program. Carefully examine the variables at the top:

char* filename = NULL ;
char program [ROWS][COLS] ;
char program_bin_str [ROWS][17] ;
unsigned short int program_bin [ROWS] ;

The first pointer variable, filename, is a pointer to a string that contains the name of the text file you’ll be reading. Your program must take in as an argument the name of a .asm file. As an example, once you compile your main program, you would execute it as follows:

./assembler test1.asm

In the last assignment you learned how to use the arguments passed into main. So the first thing to implement is to check argc to see if the program has received any arguments. If it does, point filename to the argument that contains the passed-in string that is the file’s name. You should return from main immediately after printing an error message if the caller doesn’t provide an input file name. For example, something like this:

error1: usage: ./assembler .asm

Start by updating assembler.c to read in the arguments and store the filename. Compile your changes and test them before continuing.

Part 1: Read the .asm File
The next thing to do is to actually read the file into memory. main’s next call will be

int read_asm_file (char* filename, char program [ROWS][COLS] ) ;

The purpose of read_asm_file is to open the .asm file, and place its contents into the 2D array program[][].
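As a rough sketch of the fopen/fgets pattern this function needs (the error message and return codes follow the spec above, but the newline-stripping loop is just one reasonable way to do it, and error handling is minimal):

```c
#include <stdio.h>

#define ROWS 100
#define COLS 255

/* Sketch: read each line of filename into program[][], stripping newlines. */
int read_asm_file (char* filename, char program [ROWS][COLS]) {
    FILE* fp = fopen (filename, "r");
    if (fp == NULL) {
        printf ("error2: read_asm_file failed\n");
        return 2;
    }
    int row = 0;
    while (row < ROWS && fgets (program[row], COLS, fp) != NULL) {
        for (int i = 0; program[row][i] != '\0'; i++) {
            if (program[row][i] == '\n' || program[row][i] == '\r') {
                program[row][i] = '\0';   /* fgets keeps the newline; drop it */
                break;
            }
        }
        row++;
    }
    fclose (fp);
    return 0;
}
```

Because the loop stops on a NULL return from fgets, this shape handles both files that end on an instruction and files with a trailing empty line.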
You must complete the implementation of this function in the provided helper file asm_parser.c. Notice that it takes in the pointer to the filename that you’ll open in this function. It also takes in the two-dimensional array, program, that was defined back in main.

You’ll also see that ROWS and COLS are two #define’d macros in asm_parser.h. ROWS is set to 100 and COLS is set to 255. This means that you can only read in a program that is up to 100 lines long, and each line of this program can be no longer than 255 characters. When the program compiles, the compiler will replace all instances of ROWS with 100 and all instances of COLS with 255. This means you can #define these values once to avoid Magic Numbers and simplify your program.

You’ll want to look at the class notes (or a C reference textbook) to use fopen to open the filename that has been passed in. Then you’ll want to use a function like fgets to read each line of the .asm file into the program[][] 2D array. Be aware that fgets will keep carriage returns (aka the newline character) and you’ll need to strip these from the input.

Take a look at the test1.asm file that was included in the helper files.
It contains the following program:

ADD R1, R0, R1
MUL R2, R1, R1
SUB R3, R2, R1
DIV R1, R3, R2
AND R1, R2, R3
OR R1, R3, R2
XOR R1, R3, R2

After you complete read_asm_file and run it on test1.asm, your 2D array program[][] would contain the contents of the .asm file in this order:

        0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
   0  'A'  'D'  'D'  ' '  'R'  '1'  ','  ' '  'R'  '0'  ','  ' '  'R'  '1'  '\0'
   1  'M'  'U'  'L'  ' '  'R'  '2'  ','  ' '  'R'  '1'  ','  ' '  'R'  '1'  '\0'
   2  'S'  'U'  'B'  ' '  'R'  '3'  ','  ' '  'R'  '2'  ','  ' '  'R'  '1'  '\0'
   3  'D'  'I'  'V'  ' '  'R'  '1'  ','  ' '  'R'  '3'  ','  ' '  'R'  '2'  '\0'
   4  'A'  'N'  'D'  ' '  'R'  '1'  ','  ' '  'R'  '2'  ','  ' '  'R'  '3'  '\0'
   5  'O'  'R'  ' '  'R'  '1'  ','  ' '  'R'  '3'  ','  ' '  'R'  '2'  '\0'  X
   6  'X'  'O'  'R'  ' '  'R'  '1'  ','  ' '  'R'  '3'  ','  ' '  'R'  '2'  '\0'
   7  '\0'  X    X    X    X    X    X    X    X    X    X    X    X    X    X

Notice there are no newline characters at the end of these lines.

If reading in the file is a success, return 0 from the function. If not, return 2 from the function and print an error to the screen:

error2: read_asm_file failed

Implement and test this function carefully before continuing on with the assignment.

Part 2: Parse an Instruction
You only need to parse the following instructions: ADD, MUL, SUB, DIV, AND, OR, XOR. You do not need to implement ADD IMM or AND IMM unless you want to attempt the extra credit.

Once read_asm_file is working properly, go back in main, and call parse_instruction, which is also located in asm_parser.c:

int parse_instruction (char* instr, char* instr_bin_str) ;

Purpose, Arguments, and Return Value
The purpose of this function is to take in a single row of your program[][] array and convert it to its binary equivalent in text form (as a string of 1s and 0s). The argument instr must point to a row in main’s 2D array program[][]. The argument instr_bin_str must point to the corresponding row in main’s 2D array program_bin_str[][].
If no errors are encountered, the function will return 0. If any error occurs in this function, it should print an error message such as

error3: parse_instruction failed

and return the number 3 immediately.

Let’s assume you’ve called parse_instruction and instr points to the first row in your program[][] array:

          0    1    2    3    4    5    6    7    8    9   10   11   12   13   14
*instr  'A'  'D'  'D'  ' '  'R'  '1'  ','  ' '  'R'  '0'  ','  ' '  'R'  '1'  '\0'

parse_instruction needs to examine this string and convert it into a binary equivalent. You’ll need to use the LC4 ISA to determine the binary equivalent of an instruction. When your function returns, the memory pointed to by instr_bin_str should look like this:

                  0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
*instr_bin_str  '0'  '0'  '0'  '1'  '0'  '0'  '1'  '0'  '0'  '0'  '0'  '0'  '0'  '0'  '0'  '1'  '\0'

Notice this isn’t actually binary, but it is the ADD instruction’s binary equivalent in text (C String) form. We will convert this string form of the binary instruction to an integer (hexadecimal) later.

How to implement this function
The purpose of converting the instruction to a binary string (instead of to the binary number it will eventually become) is so that you can build this string up little by little.

Investigate the strtok function in the standard C string library if you haven’t already done so for the last assignment. strtok allows you to parse a string that is separated by delimiters. In this function you’ll be parsing the string pointed to by instr and you’ll be building up the string pointed to by instr_bin_str. instr will contain spaces and commas (those will be your delimiters).

Your first call to strtok on the instr string should return back the instruction function: ADD, SUB, MUL, DIV, XOR, etc. The only thing common to all 26 instructions in the ISA is that the very first part of them is the instruction function (e.g. ADD).
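The tokenize-then-dispatch idea might be sketched like this. Two caveats: the parse_add stub below is a hypothetical stand-in (it only writes an opcode so the sketch compiles), and tokenizing a local copy rather than instr itself is a design choice of this sketch so that the full instruction can still be handed to the helper, as the requirements ask.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real parse_add, so this sketch compiles.
 * The real helper fills in all 16 bits per the LC4 ISA. */
static int parse_add (char* instr, char* instr_bin_str) {
    (void) instr;
    strcpy (instr_bin_str, "0001000000000000");  /* opcode only, illustrative */
    return 0;
}

/* Sketch: grab the instruction function with strtok, then dispatch. */
int parse_instruction (char* instr, char* instr_bin_str) {
    char copy[255];
    strcpy (copy, instr);              /* strtok modifies its input */
    char* op = strtok (copy, " ,");
    if (op == NULL) {
        printf ("error3: parse_instruction failed\n");
        return 3;
    }
    if (strcmp (op, "ADD") == 0) {
        return parse_add (instr, instr_bin_str);
    }
    /* ... else if "MUL", "SUB", "DIV", "AND", "OR", "XOR" ... */
    printf ("error3: parse_instruction failed\n");
    return 3;
}
```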
Once you determine the instruction function, you’ll call the appropriate helper function to parse the remainder of the instruction.

As an example, let’s say the instruction function is ADD. Once you’ve determined the instruction function is ADD, you would call the parse_add helper function. It will take the instruction instr as an argument, but also the instr_bin_str string, because parse_add will be responsible for determining the binary equivalent for the ADD instruction you are currently working on and it will update instr_bin_str.

int parse_add (char* instr, char* instr_bin_str ) ;

When parse_add returns, and if no errors occurred during parsing the ADD instruction, instr_bin_str should now be complete. At this time, you can return 0 from parse_instruction. If you encounter any errors in this function, you should print an error3 message and return 3.

This is only the first instruction. main will need to do this for each row of program[][], using strtok to get the instruction function, calling the appropriate parse_xxx helper function to finish the instruction, and updating instr_bin_str appropriately.

Part 3: Parse an ADD Instruction
This function is specific to parsing the ADD instruction, but you will need to write a similar function for each of the different instruction functions.

The helper function parse_add should be called only by the parse_instruction function. It has two char* arguments: instr and instr_bin_str.

int parse_add (char* instr, char* instr_bin_str ) ;

Because this function will only be called when parse_instruction encounters an ADD instruction function, instr will contain an ADD instruction and instr_bin_str should be empty.

Similar to the other functions, if this function encounters no errors it will return 0, and if any error occurs it should return 4 after printing an error4 message:

error4: parse_add() failed

The purpose of this function is to populate instr_bin_str.
Upon the function’s start, the binary opcode can be immediately copied into instr_bin_str[0:3]. Afterwards, strtok can tokenize the remaining string to separate out the registers RD, RS, and RT from the instr string.

For each register, call the parse_reg helper function:

int parse_reg (char reg_num, char* instr_bin_str) ;

This function must take a number in character form and populate instr_bin_str with the appropriate corresponding binary number. For example, if RD = R0 for the ADD instruction, the ‘0’ character would be passed in the argument reg_num. parse_reg then copies the characters 000 into instr_bin_str[4:6].

parse_reg should return 5 if any errors occur, after printing a standard error5 message; otherwise it returns 0 upon success.

To implement the parse_reg function, consider using a switch() statement. This helper function should only parse one register at a time. Also, because it is not specific to the ADD instruction (nearly all instructions contain registers), you can call it from other functions that need their registers converted to binary. Example: parse_mul should also call parse_reg.

Note that parse_add must also populate the sub-opcode field in instr_bin_str[10:12]. When parse_add returns, instr_bin_str should be complete. parse_instruction should then return to main.

You will need to create a helper function for each instruction type; use parse_add as a model. As an example, you’ll need to create parse_mul, parse_xor, etc.
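A switch-based parse_reg might look roughly like the sketch below. One assumption to flag: this version appends the three bits to the end of instr_bin_str with strcat, rather than writing into a fixed offset like instr_bin_str[4:6]; either design can work, since parse_add controls the order in which fields are added.

```c
#include <stdio.h>
#include <string.h>

/* Sketch: append the 3-bit binary form of register digit reg_num
 * ('0' through '7') onto the end of instr_bin_str. */
int parse_reg (char reg_num, char* instr_bin_str) {
    switch (reg_num) {
        case '0': strcat (instr_bin_str, "000"); break;
        case '1': strcat (instr_bin_str, "001"); break;
        case '2': strcat (instr_bin_str, "010"); break;
        case '3': strcat (instr_bin_str, "011"); break;
        case '4': strcat (instr_bin_str, "100"); break;
        case '5': strcat (instr_bin_str, "101"); break;
        case '6': strcat (instr_bin_str, "110"); break;
        case '7': strcat (instr_bin_str, "111"); break;
        default:
            printf ("error5: parse_reg failed\n");
            return 5;
    }
    return 0;
}
```

With this design, parse_add would strcat the opcode first, then call parse_reg for RD, append the sub-opcode, and call parse_reg for RS and RT.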
They will all be very similar functions, so perfect parse_add before you attempt the other functions.

Part 4: Converting the binary string to a hexadecimal formatted integer
After parse_instruction returns successfully to main, main should call str_to_bin:

unsigned short int str_to_bin (char* instr_bin_str) ;

This function should be passed the recently parsed binary string from the array program_bin_str[X], where X represents the binary instruction that was just populated by the last call to parse_instruction.

The purpose of this function is to take a binary string and convert it to a 16-bit binary equivalent and return it to the calling function. To implement this function, we recommend using strtol. If strtol returns 0, print an error6 message and return 6.

As an example of what this function should do, if it was called with the following argument:

                  0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
*instr_bin_str  '0'  '0'  '0'  '1'  '0'  '0'  '1'  '0'  '0'  '0'  '0'  '0'  '0'  '0'  '0'  '1'  '\0'

then it should return 0x1201, which is the hexadecimal equivalent for this binary string. You can verify and print out what it returns by using printf(“0x%X”), which will print out integers in hexadecimal format.

Part 5: Writing the .obj object file
Recall the Code section of the object file format: a 3-word header (xCADE, address, n), followed by an n-word body comprising the instructions. This corresponds to the .CODE directive in assembly.

Given this information, the last function to implement is:

int write_obj_file (char* filename, unsigned short int program_bin[ROWS] ) ;

The purpose of this function is to take the assembled program, represented in hexadecimal in program_bin[], and output it to a file with the extension .obj. It must encode the file using the .obj file format specified in class. If test1.asm was pointed to by filename, your program would open up a file to write to called test1.obj.

This function should do the following:
1. Take the filename passed in the argument filename and change the last 3 letters to “obj”.
2.
Open up that file for writing, in binary format. The file you'll create is not a text file; these are not C strings you're writing, they are binary numbers.
3. Write out the first word of the header: 0xCADE.
4. Write out the address your program should be loaded at; 0x0000 is the default.
5. Count the number of rows that contain data in program_bin[], then write out that count.
6. Now that the header is complete, write out the rows of data in program_bin[].
7. Close the file using fclose.

If any errors occur, print an appropriate error message and return 7. Otherwise return 0, and main should then return 0 to the caller. Your program is now complete.

Testing

Validate Output with PennSim

Once you have successfully written an object file from an assembly file, examine the .obj file's contents using the Linux utility hexdump. From the Linux terminal prompt, type:

hexdump test1.obj

hexdump will show you the binary contents. Make certain it matches your expectations! As an example, for the program described in the Suggested Approach, the expected hexdump would be:

0000000 deca 0000 0700 0112 4914 9116 da12 8352
0000010 d252 da52
0000014

You must test your .obj files in PennSim before submission. If they fail to load, you should expect little, if any, credit for this assignment. It is your responsibility to test files other than test1.asm: load your .obj files into PennSim and see if they work before submitting. We will be testing your program with different .asm files, so you should try out different .asm files of your own.

Files for Testing

We are only providing test1.asm for testing. However, you can (and should) create additional files that exercise different parts of the program. For each test file, bring up PennSim, assemble it, and check the .obj file contents with hexdump. Then read it into your program and see if you can assemble it into the same object file.
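The file-writing steps above can be sketched as follows. This is illustrative only: the hexdump shown above (0xCADE stored on disk as bytes CA DE) implies words are written high byte first, and writing one byte at a time with fputc sidesteps x86 endianness. One deliberate simplification, flagged as an assumption: the row count is passed in explicitly, whereas the real write_obj_file must count the populated rows of program_bin[] itself, and must also derive the .obj filename from the .asm filename (omitted here).

```c
#include <stdio.h>

/* Write one 16-bit word high byte first, matching the byte order the
 * expected hexdump implies (0xCADE appears on disk as bytes CA DE). */
static int write_word_be(FILE *f, unsigned short w) {
    if (fputc((w >> 8) & 0xFF, f) == EOF) return 7;
    if (fputc(w & 0xFF, f) == EOF) return 7;
    return 0;
}

/* Sketch of the write steps; not the required write_obj_file signature.
 * Assumption: count is supplied by the caller. */
int write_obj_sketch(const char *filename, const unsigned short *program_bin, int count) {
    FILE *f = fopen(filename, "wb");              /* step 2: binary mode  */
    if (!f) { fprintf(stderr, "error: cannot open %s\n", filename); return 7; }
    int err = write_word_be(f, 0xCADE);           /* step 3: magic word   */
    if (!err) err = write_word_be(f, 0x0000);     /* step 4: load address */
    if (!err) err = write_word_be(f, (unsigned short)count); /* step 5    */
    for (int i = 0; !err && i < count; i++)       /* step 6: the body     */
        err = write_word_be(f, program_bin[i]);
    fclose(f);                                    /* step 7               */
    return err;
}
```

On errors the sketch returns 7 after printing a message, following the error convention in the step list above.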
You can create a bunch of test cases very easily with PennSim. You should test your assembler program on a variety of .asm files, not just simple examples.

Unit Testing

When writing such a large program, it is a good strategy to "unit test." This means that as you create a small bit of working code, you compile it and create a simple test for it. DO NOT write the entire program, compile it, and then start testing it: you will never resolve all of your errors this way. You need to unit test your program as you go along, or it will be impossible to debug.

GDB for Debugging

gdb allows you to inspect the actual contents of memory, which is an advantage over print statements: print statements only show ASCII representations, and only at the moments you call them, while gdb lets you see the actual contents of any variable at any point during execution. Reminder: you will need to add the -g flag to all intermediate compilation steps, not just the assembler target, and you will need to use the --args option to tell gdb that you have arguments to your program:

gdb -q -tui --args ./assembler test1.asm

Submission

Submission Checks

There is a single "submission check" test that runs once you upload your code to Gradescope. This test checks that you have submitted all four required files and that your program and any autograder code compile successfully. It does not run your program or provide any input on whether it works; it only ensures that all the required components exist. This check is performed after uploading to Gradescope.

The Actual Submission

You will submit this assignment to Gradescope in the assignment entitled Assignment 11: File I/O, Making the LC4 Assembler. Download all of your .c source and .h header files and your Makefile from Codio to your computer, then upload all four of these files to the Gradescope assignment.
You should not submit any .asm testing files, whether provided or your own. We will only grade the last submission uploaded.

Grading

We will only grade the last submission, regardless of the results of any previous submission. We will not provide partial credit for autograder tests.

Assembler

We do provide one example that we will test with, so you can be sure to get those points; you will have to figure out the rest yourself. This assignment is worth 200 points, normalized to 100% for gradebook purposes:

20 points: correct Makefile
30 points: general code inspection (manually graded)
10 points: handling command line arguments and writing the correct file
20 points: correctly handling endianness
60 points: correctly processing test1.asm (which we provide to you)
60 points: correctly processing our other test files (which we do not provide to you)

Extra Credit

The extra credit is worth 11 percentage points, so the highest possible grade on the assignment is 111%. Your extra credit must not break functionality for the non-extra-credit requirements; make a backup of your finalized program before attempting the extra credit. If your program fails to meet the basic requirements, you will lose more points than the extra credit would gain. There is no partial credit: an extra-credit item must work completely for any credit. We will not give guidance on how to do these, since they are designed to be challenge problems.

2 percentage points: modify your read_asm_file function to ignore comments in .asm files. You must handle all types of comments for credit.
2 percentage points: modify your program to handle ADD IMM and AND IMM instructions. Both must work completely for credit (no partial credit for one instruction).
5 percentage points: modify your program to handle the .CODE and .ADDR directives. As a hint, you will need another array to hold the addresses, e.g. unsigned short int address[ROWS].
2 percentage points: modify your program to handle the .DATA directive.

FAQ

Quick Hints
● You are allowed to use the switch statement, a compact way to handle long if/else chains.
● We won't be testing with string literals in the unit testing.
● You do not need a script file, but you can certainly add one to automate your own testing.
● We will only be testing with valid .asm files; that is, all the test files will assemble correctly in PennSim.
● You can raise an error if the register number for Rx is not valid (e.g. R8), but again, we will not be testing with invalid files.

Formatting
● We will not test with blank lines between instructions, even though PennSim can assemble these without error.
● We do not expect you to use a regex to check whether an instruction matches a format.
● Lines can end with trailing spaces, a newline, or just EOF (end of file) if it is the last line in the .asm file.
● strtok is sufficient to break an instruction into its parts (hint: use a delimiter of " ,"). The Assignment 10 instructions have a link to a good resource.
● All characters will be uppercase, except for the x in hexadecimal values (which only applies to some of the extra credit challenges and will never be X), even though PennSim can assemble lowercase without error.

Endianness
● The x86 (the processor used by Codio) has a different endianness than the LC4. When doing fread()s of 2-byte words, swapping is needed to adjust for this. That same swapping is not needed with fgetc() or fread()s of size 1.
● If you read the .obj file into memory one word at a time using fread(), you will need to swap for endianness. In contrast, if you read the .obj file into memory one byte at a time with fgetc(), the endianness doesn't need to be adjusted.
However, you will have to combine two bytes into a word using bitwise operators.

Resources
● strtok reference: https://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm
● switch statement reference: https://www.tutorialspoint.com/cprogramming/switch_statement_in_c.htm
● strtol reference: https://www.tutorialspoint.com/c_standard_library/c_function_strtol.htm
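The two approaches in the endianness notes can be sketched as follows; the helper names are illustrative assumptions, not part of the assignment.

```c
/* For words read with a single 2-byte fread() on little-endian x86:
 * swap the two bytes of the 16-bit word. */
unsigned short swap16(unsigned short w) {
    return (unsigned short)(((w >> 8) & 0xFF) | ((w & 0xFF) << 8));
}

/* For bytes read one at a time with fgetc(): combine them with
 * bitwise operators. The first byte on disk is the high byte, as the
 * expected hexdump (CA DE for 0xCADE) shows. */
unsigned short combine_bytes(int hi, int lo) {
    return (unsigned short)(((hi & 0xFF) << 8) | (lo & 0xFF));
}
```

With fgetc, no swap is ever needed: the shift-and-or already places each byte where it belongs, regardless of the host's endianness.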


[SOLVED] AI6126 Project 1

CelebAMask Face Parsing

Project 1 Specification

Important Dates
Development Phase: 21 Feb 2025 12:00 AM – 21 Mar 2025 12:00 AM (UTC+8)
Test Phase: 21 Mar 2025 12:00 AM – 28 Mar 2025 12:00 AM (UTC+8)

Group Policy
This is an individual project.

Challenge Description
Face parsing assigns a pixel-wise label to each semantic component of a face, e.g., eyes, nose, mouth. The goal of this mini challenge is to design and train a face parsing network. We will use data from the CelebAMask-HQ Dataset [1] (see Figure 1). For this challenge, we prepared a mini-dataset consisting of 1000 training and 100 validation pairs of images, where both images and annotations have a resolution of 512 x 512. The performance of the network will be evaluated by the F-measure between the predicted masks and the ground truth of the test set (the ground truth of the test set will not be released).

Figure 1. Sample images in CelebAMask-HQ

Assessment Criteria
We will evaluate and rank the performance of your network model on our given 100 unseen test images based on the F-measure. The higher the rank of your solution, the higher the score you will receive. In general, scores will be awarded based on the table below.

Percentile in ranking:  ≤ 5%  ≤ 15%  ≤ 30%  ≤ 50%  ≤ 75%  ≤ 100%  *
Scores:                  20    18     16     14     12     10     0

Notes:

Submission Guideline
● Download dataset: this link
● Train and test your network using our provided training set.
● [Optional] Evaluate your model on an unseen CodaBench validation set during development, with up to 5 submissions per day and 60 in total.

Restrictions
● To maintain fairness, your model should contain fewer than 1,821,085 trainable parameters, which is 120% of the trainable parameters in SRResNet [2] (your baseline network). You can use sum(p.numel() for p in model.parameters()) to compute the number of parameters in your network.
● No external data and pretrained models are allowed in this mini challenge.
You are only allowed to train your models from scratch using the 1000 image pairs in our given training dataset.
● You should not use an ensemble of models.

Step-by-step Submission Procedure
We host the validation and test sets on CodaBench. Please follow these guidelines to ensure your results are recorded. The website of the competition is https://www.codabench.org/competitions/5726
1. Register a CodaBench account with your NTU email (ends with @e.ntu.edu.sg), using your matric number as your username.
2. Register for this competition and wait for approval.
3. Submit a file with your prediction results as follows. Include source code and pretrained models in the test phase; they are not required for the dev phase.
1) A short PDF report (max five A4 pages, Arial 10 font) detailing your model, loss functions, and any processing steps used to obtain your results. Include the F-measure on the test set and the total number of model parameters. Name your report: [YOUR_NAME]_[MATRIC_NO]_[project_1].pdf
2) A screenshot of the CodaBench leaderboard with your username and best score. We will use the leaderboard score for marking, but will keep your screenshot for double-check reference.

Computational Resource

References
[1] Cheng-Han Lee, Ziwei Liu, Lingyun Wu, Ping Luo, MaskGAN: Towards Diverse and


[SOLVED] CS6250_25fall – Spanning Tree

CS 6250 Spring 2025
Spanning Tree

Table of Contents
PROJECT GOAL
Part 1: Setup
Part 2: Files Layout
Part 3: TODOs
Part 4: Testing and Debugging
Part 5: Assumptions and Clarifications
What to Turn In
What you can and cannot share
Rubric

PROJECT GOAL

Part 1: Setup
Download the project files from Canvas. You can do this project on your host system if it has Python 3.11.x; the project does not have any dependencies outside of Python. You must be sure that your submission runs properly in Gradescope, the environment where your project will be graded. Gradescope and the VM are the only valid environments for this course.

Part 2: Files Layout
There are many files in the SpanningTree directory, but you should only modify Switch.py. The files in the project skeleton are described below. DO NOT modify these files; all of your work must be in Switch.py ONLY. You should study the other files to understand the project framework.
• Topology.py – Represents a network topology of layer 2 switches. This class reads in the specified topology and arranges it into a data structure that your Switch can access. This class also adjusts the topology if any changes are indicated within the XXXTopo.py class.
• Message.py – This class represents the message format you will use to communicate between switches, similar to the course lectures. Specifically, you will create and send messages in Switch.py by declaring a message as:
msg = Message(claimedRoot, distanceToRoot, originID, destinationID, pathThrough, timeToLive)
• run.py – A "main" file that loads a topology file (see XXXTopo.py below), uses it to create a Topology object containing Switches, and runs the simulation.
• XXXTopo.py, etc. – These are topology files that you will pass as input to run.py.

Part 3: TODOs
This is an outline of the code you must implement in Switch.py, with suggestions for implementation. Keep in mind that certain update rules will take precedence over others.

A.
Decide on the data structure(s) that you will use to keep track of the spanning tree.
1. The collection of active links across all switches is the resulting spanning tree.
3. This is a distributed algorithm. The switch can only communicate with its direct neighbors; it does not have an overall view of the topology as a whole (do not access self.topology).
4. An example data structure should include, at a minimum:
a. a variable to store the switch ID that this switch sees as the root,
b. a variable to store the distance to the switch's root,
c. a list or other datatype that stores the "active links" (only the links to neighbors that are in the spanning tree),
d. a variable to keep track of which neighbor the switch goes through to get to the root (a switch should only go through one neighbor, if any, to get to the root).

B. Implement processing a message from an immediate neighbor.
1. You do not need to worry about sending the initial messages; you only need to handle the sending and processing of subsequent messages.
2. For each message a switch receives, the switch will need to:
a. Determine whether an update to the switch's root information is necessary, and update accordingly.
I. The switch should update the root stored in its data structure if it receives a message with a lower claimedRoot.
II. The switch should update the distance stored in its data structure if a) the switch updates the root, or b) there is a shorter path to the same root.
b. Determine whether an update to the switch's active links data structure is necessary, and update accordingly. The switch should update the activeLinks if:
I. The switch finds a new path to the root (through a different neighbor). In this case, the switch should add the new link to activeLinks and remove the old link from activeLinks.
II. The switch receives a message with pathThrough = TRUE but does not have that originID in its activeLinks list. In this case, the switch should add originID to its activeLinks list.
III.
The switch receives a message with pathThrough = FALSE but has that originID in its activeLinks. In this case, the switch should remove originID from its activeLinks list.
c. Determine when the switch should send messages to its neighbors, and send them.
I. The message FIFO queue is maintained in Topology.py. The switch implementation does not interact with the FIFO queue directly; it uses the send_message function and receives messages as arguments to the process_message function.
II. When sending messages, pathThrough should only be TRUE if the destinationID switch is the neighbor that the originID switch goes through to get to the claimedRoot. Otherwise, pathThrough should be FALSE.
III. The switch should continue sending messages to its neighbors until the ttl (time to live) on the Message being processed is 0. You need to decrement the ttl every time you process a Message. Note: this is one place where this project deviates from the STP algorithm you learned in the lectures.
a. The switch that is dropped will never split the original topology; the final topology will remain connected.
b. The switch that is dropped could be the original root; your algorithm should adapt accordingly.
c. The topology file will include the ttl_limit and drops. The ttl_limit is the starting ttl for each message in the topology. The drops indicate which switch(es) will be dropped.
d. You do not need to access the ttl_limit; it will be given to each message at the start of the process. You need to decrement the ttl to 0 to trigger the Topology's drop process.

C. Write a logging function.
1. The switch should only output the links that are in the spanning tree.
2. Follow the format below (# – #). Unsorted or non-standard formatting will result in penalties. Examples of correct logs with the correct format have been provided in the project directories.
3.
Sorted:         Not sorted:
1 – 2, 1 – 3    1 – 3, 1 – 2
2 – 1, 2 – 4    2 – 4, 2 – 1
3 – 1           3 – 1
4 – 2           4 – 2

Part 4: Testing and Debugging
To run your code on a specific topology (SimpleLoopTopo.py in this case) and output the results to a text file (out.txt in this case), execute the following command:

python run.py SimpleLoopTopo

"SimpleLoopTopo" is not a typo in the example command – don't include the .py extension.

We have included several topologies with correct solutions for you to test your code against. You can (and are encouraged to) create more topologies and test suites with output files and share them on Ed Discussion; there will be a designated post where students can share these files. You will only be submitting Switch.py, so your implementation must be confined to modifications of that file. We recommend testing your submission against a clean copy of the rest of the project files prior to submission.

Part 5: Assumptions and Clarifications
A. All switch IDs are positive integers, and distinct.
1. These integers do not have to be consecutive.
2. They will not always start at 1.
3. There is no maximum value beyond language (Python) limitations (but your code does not need to check for this).
B. Tie breakers: If there are multiple paths of equal distance to the same root, the switch should choose the path through the neighbor with the lowest switch ID.
1. Example: switch 5 has two paths to root switch 1, through switch 3 and switch 2, each 2 hops in length. Switch 5 should select switch 2 as the path to the root and disable forwarding on the link to switch 3.
C. There is a single distinct solution spanning tree for each topology. This is guaranteed by the first two assumptions (A and B).
D. All switches in the network will be connected to at least one other switch, and all switches are able to reach every other switch. It will always be possible to form a tree that spans the entire topology.
E.
There will be only 1 link between each pair of directly connected switches. You do not need to consider how STP would behave with redundant links.
G. The solution implemented in Switch.py should terminate without intervention. When there are no more messages in the queue to process, the simulation will log the output and terminate. Your algorithm should stop sending messages when the ttl on the Message being processed is 0.
H. Your solution should not require any outside Python modules. Do not import any other modules.

What to Turn In
Before submission:
a. Make sure your logging format is correct. Invalid format will be marked as incorrect.
b. Remove all print statements from your code before turning it in. Print statements can have drastic effects on runtime; your submission must take less than 30 seconds per topology. If print statements in your code adversely affect the grading process, your work will not receive full credit.
c. Your algorithm must converge on the spanning tree within the topology's ttl_limit.
d. Make sure your Switch.py works in Gradescope. Gradescope will give you immediate feedback along with your grade, so we will not accept regrade requests related to incorrect submissions.
f. Helper functions are fine as long as their names don't conflict with anything already in the project. If it works in Gradescope, it is fine.
After submission:
h. Your grade in Gradescope will be your grade for this project, with some caveats:
b. Any attempt to bypass or distort the autograder will result in a 0 and will be referred to OSI.

What you can and cannot share

Rubric
10 pts – Correct Submission: for turning in the correct file with the correct name. You receive 10 FREE points for reading the instructions.
30 pts – Provided Topologies: for correct spanning tree results (log files) on the provided topologies.
60 pts – Hidden Topologies: for correct spanning tree results (log files) on the four topologies that you will not have access to.
These cases are used to prevent students from hard-coding a solution.
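The root and active-link update rules from Part 3, together with the lowest-ID tie breaker from Part 5, can be sketched as a small self-contained model. This is illustrative only: the class name and method below are assumptions, not the provided Switch.py skeleton, and message sending and ttl handling are omitted.

```python
class SwitchState:
    """Illustrative per-switch state for the STP update rules (not Switch.py)."""

    def __init__(self, switch_id):
        self.id = switch_id
        self.root = switch_id    # initially, each switch believes it is the root
        self.distance = 0        # distance to the believed root
        self.active_links = []   # neighbors kept in the spanning tree
        self.through = None      # neighbor used to reach the root

    def process(self, claimed_root, distance, origin, path_through):
        """Apply the update rules for one received message; return True on update."""
        updated = False
        # Rule I: a lower claimedRoot wins.
        if claimed_root < self.root:
            self.root = claimed_root
            self.distance = distance + 1
            updated = True
        # Rule II: same root, strictly shorter path, or equal-length path
        # through a lower-ID neighbor (the Part 5 tie breaker).
        elif claimed_root == self.root and (
            distance + 1 < self.distance
            or (distance + 1 == self.distance
                and self.through is not None and origin < self.through)):
            self.distance = distance + 1
            updated = True
        if updated:
            # New path to the root: swap the old link for the new one.
            if self.through in self.active_links:
                self.active_links.remove(self.through)
            self.through = origin
            if origin not in self.active_links:
                self.active_links.append(origin)
        # pathThrough maintenance for links that are not the root path.
        elif path_through and origin not in self.active_links:
            self.active_links.append(origin)
        elif (not path_through and origin != self.through
              and origin in self.active_links):
            self.active_links.remove(origin)
        return updated
```

In the real project, a True return would trigger new messages to the switch's neighbors (with the ttl decremented), which this sketch does not model.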


[SOLVED] CS6250_25fall Project – Distance Vector

CS 6250 Spring 2025
Distance Vector

Table of Contents
PROJECT GOAL
Part 0: Getting Started
Part 1: Files Layout
Part 2: TODOs
Part 3: Testing and Debugging
Part 4: Assumptions and Clarifications
Part 5: Correct Logs for Provided Topologies
Part 6: Spirit of the Project
Part 7: FAQs
What to Turn In
What you can and cannot share
Rubric

PROJECT GOAL
In the lectures, you learned about Distance Vector (DV) routing protocols, one of the two classes of routing protocols. DV protocols, such as RIP, use a fully distributed algorithm to find shortest paths by solving the Bellman-Ford equation at each node. In this project, you will develop a distributed Bellman-Ford algorithm and use it to calculate routing paths in a network. This project is similar to the Spanning Tree project, except that we are solving a routing problem, not a switching problem.

In "pure" distance vector routing protocols, the hop count (the number of links to be traversed) determines the distance between nodes. Some distance vector routing protocols that operate at higher levels (like BGP) must make routing decisions based on business valuations; these protocols are sometimes referred to as Path Vector protocols. We will explore this by using weighted links (including negatively weighted links) in our network topologies. We can think of the Nodes in this simulation as individual Autonomous Systems (ASes), and the weights on the links as a reflection of the business relationships between ASes. Links are directed, originating at one Node and terminating at another.

Part 0: Getting Started
You should review some materials on Bellman-Ford.
Some resources include:
• Wikipedia (https://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm)
• "Computer Networking: A Top-Down Approach" by Kurose and Ross (the 7th edition discusses the algorithm on pages 384-385 in Chapter 5, "The Network Layer: Control Plane")

Download and unzip the Project Files for Distance Vector from Canvas in the Assignments section. This project can be completed in the class VM or on your local machine using Python 3.10.x. You must be sure that your submission runs properly in Gradescope.

Part 1: Files Layout
The DistanceVector directory contains the following files:
• DistanceVector.py – This is the only file you will modify. It is a specialization (subclass) of the Node class that represents a network node (i.e., a router) running the Distance Vector algorithm, which you will implement.
• Node.py – Represents a network node, i.e., a router.
• Topology.py – Represents a network topology. It is a container class for a collection of DistanceVector Nodes and the network links between them.
• run_topo.py – A simple "driver" that loads a topology file (see *Topo.txt below), uses that data to create a Topology object containing the network Nodes, and starts the simulation.
• *Topo.txt – These are valid topology files that you will pass as input to the run.sh script (see below). Topologies end with ".txt".
• BadTopo.txt – This is an invalid topology file, provided as an example of what not to do and so you can see what the program says if you pass it a bad topology.
• output_validator.py – This script can be run on the log output from the simulation to verify that the output file is formatted correctly. It does not verify that the contents are correct, only the format.
• run.sh – A helper script that runs some basic system checks, the topology, and the validator; a wrapper for run_topo.py and output_validator.py.

Part 2: TODOs
There are a few TODOs in DistanceVector.py:
A. Review the methods already implemented in Node.py.
a.
Because DistanceVector is a subclass of Node, consider how you might use the existing methods to complete the TODOs in this list.
b. Do NOT modify Node.py.
B. Decide on how each node will represent its distance vector.
a. Consider what might be the simplest data structure to keep track of path weights (i.e., the distance vector).
b. The distance vector variable should be local to the Node, i.e., defined in the init function as a variable accessible via the `self` object (e.g. self.mylist).
C. Implement the Bellman-Ford algorithm.
a. Each Node will:
i. send out an initial message to its neighbors,
ii. process messages received from other nodes,
iii. send updates to other nodes as needed.
b. Initially, a node only knows of:
i. itself, reachable at cost 0,
ii. its neighbors and the weights on its links to them.
c. NOTE: a node's links are unidirectional.
d. NOTE: the Bellman-Ford implementation should terminate naturally, without external intervention.
D. Write a logging function that is specific to your distance vector structure.
a. You should use the self.add_entry function to help with logging.
b. You should assume that the logging function only knows about its own node: do NOT access the topology for logging; logging should happen at the Node level.

Part 3: Testing and Debugging
To run your algorithm on a specific topology, execute the run.sh bash script:

./run.sh *Topo

Substitute the desired filename for *Topo, without the .txt suffix. This will execute your implementation of the algorithm in DistanceVector.py on the topology defined in *Topo.txt and log the results (per your logging function) to *Topo.log.

NOTE: Do not include the full filename of the topology when executing run.sh. For example, to run the algorithm on topo1.txt, specify only topo1 as the argument to run.sh.
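The per-node Bellman-Ford relaxation described in the TODOs can be sketched as follows. This is an illustrative, self-contained helper, not the DistanceVector.py API: the function name and dict representation are assumptions, and message passing, initial advertisements, and the Node.py helpers are omitted. It folds in the -99 "negative infinity" floor described later in Part 4.

```python
NEG_INF = -99  # "negative infinity" for this project

def update_vector(my_id, my_vector, link_weight, neighbor_vector):
    """Merge a downstream neighbor's advertised vector into my_vector.

    my_vector: dict mapping destination -> best known distance from this node
    link_weight: weight of this node's outgoing link to the advertising neighbor
    neighbor_vector: dict mapping destination -> the neighbor's advertised distance
    Returns True if my_vector changed, i.e. this node must re-advertise.
    """
    changed = False
    for dest, dist in neighbor_vector.items():
        if dest == my_id:
            continue  # a node keeps itself at cost 0, never negative
        # A -99 advertisement means the neighbor reaches dest at
        # infinitely low cost, so this node does too.
        candidate = NEG_INF if dist == NEG_INF else dist + link_weight
        candidate = max(candidate, NEG_INF)  # clamp at the floor
        if dest not in my_vector or candidate < my_vector[dest]:
            my_vector[dest] = candidate
            changed = True
    return changed
```

Because every update strictly decreases a value that is bounded below by -99, repeated rounds of this relaxation terminate even in the presence of negative cycles.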
We've included four good topologies for you to use in testing, and one bad topology to demonstrate an invalid topology. The provided topologies do not cover all the edge cases; your code will be graded against more complex topologies.

Part 4: Assumptions and Clarifications
A. Node behavior:
i. Example: if Node B has an incoming link from Node A but no outgoing link to Node A, Node B will still send its distance vector to Node A to "advertise" the other nodes it can reach (Nodes C and D).
b. A Node's distance vector comprises the nodes it can reach via its outgoing links (including itself at distance 0).
i. A Node will never advertise a negative distance to itself. (Important for negative cycles.)
c. A Node advertises its distance vector to its upstream neighbors.
d. Nodes do not implement poison reverse.
B. Edge and path weights:
b. The edge weight value type is an integer.
c. There is no upper limit for path weights.
d. The lower limit for path weights is -99, which is equivalent to "negative infinity" for this project.
C. Negative cycles:
a. A Node can forward traffic through a negative cycle.
b. Negative cycles are a series of directed links that originate and terminate at a single node, where the sum of the link weights is less than 0.
i. This can lead to a negative "count-to-infinity" problem. Therefore, your implementation must be able to detect negative cycles in order to terminate on its own.
ii. Any node that can reach a destination node and infinitely traverse a negative cycle en route will set the distance to that node to -99.
1. Your implementation only needs to detect and record these traversals appropriately; it does not need to mitigate them.
iii. A Node can advertise a negative distance for other nodes (but not for itself).
iv. A Node that receives an advertisement with a distance of -99 from a downstream neighbor should also assume that it can reach the same destination at infinitely low cost (-99).
v.
Example: Traffic from Node F to Node D can route through A->B->C->A indefinitely, reaching an extremely low (very negative) cost.
c. A Node will not forward traffic destined to itself.
i. Example: The below topology will not result in a count-to-infinity problem, as there are no possible pairs of source and destination nodes where traffic could indefinitely traverse a negative cycle. Node A will not forward traffic for Node A, and similarly for Nodes B and C.
D. Topologies used in grading:
a. We will be using many topologies to test your project, including but not limited to:
o topologies with and without cycles (loops), including odd-length cycles
o topologies of varying sizes, including topologies with more than 26 nodes
o topologies with node names longer than one character
o topologies with multiple paths to different nodes
o topologies that include any combination of positive, zero, and negative weights
o topologies with Nodes that have no incoming or outgoing links
▪ All nodes will be connected but:
b. We will NOT test your submission against the following topologies (which means your algorithm does not need to account for them):
o topologies with more than one link from the same origin to the same destination (multi-graphs)
o topologies with portions of the network disconnected from each other (partitioned networks)
o topologies that do not require intermediate steps (such as a topology with a single node)
o topologies with a valid path between two indirectly linked nodes, with no cycle, whose actual total cost is ≤ -99 (topologies will respect that -99 is "negative infinity" for this project)

Part 5: Correct Logs for Provided Topologies
Below are the correct final logs for the provided topologies. We provide them to help you identify correct behavior with respect to negative cycles and the assumptions in the instructions. We are only providing the final round; each topology should produce at least 2 rounds of output.
SimpleTopo:
A:(A,0) (B,1) (C,3) (D,3)
B:(B,0) (A,1) (C,2) (D,2)
C:(C,0) (B,2) (A,3) (D,0)
D:(D,0) (C,0) (B,2) (A,3)
E:(E,0) (D,-1) (C,-1) (B,1) (A,2)

SingleLoopTopo:
A:(A,0) (D,5) (E,6) (B,6) (C,16)
B:(B,0) (A,2) (D,7) (C,10) (E,0)
C:(C,0)
D:(D,0) (E,1) (B,1) (A,3) (C,11)
E:(E,0) (B,0) (A,2) (D,7) (C,10)

SimpleNegativeCycle:
AA:(AA,0) (AD,-2) (AE,-1) (AB,0) (CC,-99)
AB:(AB,0) (AA,-1) (AD,-3) (CC,-99) (AE,-2)
AD:(AD,0) (AE,1) (AB,2) (AA,1) (CC,-99)
AE:(AE,0) (AB,1) (AA,0) (AD,-2) (CC,-99)
CC:(CC,0) (AB,0) (AA,-1) (AD,-3) (AE,-2)

ComplexTopo:
ATT:(ATT,0) (CMCT,-99) (TWC,-99) (GSAT,-8) (UGA,-99) (VONA,-11) (VZ,-3)
CMCT:(CMCT,0) (TWC,-99) (ATT,1) (VONA,-10) (GSAT,-7) (UGA,-99) (VZ,-2)
DRPA:(DRPA,0) (EGLN,1) (GT,-1) (UC,-1) (CMCT,-99) (TWC,-99) (ATT,13) (OSU,-1) (VONA,2) (GSAT,5) (UGA,-99) (PTGN,1) (VZ,10)
EGLN:(EGLN,0) (GT,-2) (UC,-2) (DRPA,1) (CMCT,-99) (OSU,-2) (TWC,-99) (ATT,13) (PTGN,0) (VONA,3) (GSAT,5) (UGA,-99) (VZ,11)
GSAT:(GSAT,0) (VONA,-3) (VZ,5) (UGA,-99) (ATT,7) (CMCT,-99) (TWC,-99)
GT:(GT,0) (UC,0) (EGLN,2) (OSU,0) (DRPA,3) (PTGN,2) (CMCT,-99) (VONA,5) (TWC,-99) (ATT,15) (VZ,13) (GSAT,7) (UGA,-99)
OSU:(OSU,0) (UC,0) (GT,0) (EGLN,2) (PTGN,2) (VONA,5) (DRPA,3) (VZ,13) (GSAT,7) (CMCT,-99) (ATT,15) (UGA,-99) (TWC,-99)
PTGN:(PTGN,0) (OSU,-1) (UC,-1) (GT,-1) (EGLN,1) (VONA,3) (VZ,11) (GSAT,5) (DRPA,2) (ATT,13) (UGA,-99) (CMCT,-99) (TWC,-99)
TWC:(TWC,0) (CMCT,-99) (ATT,1) (VONA,-10) (VZ,-2) (GSAT,-7) (UGA,-99)
UC:(UC,0) (GT,0) (EGLN,2) (OSU,0) (PTGN,2) (DRPA,3) (VONA,5) (CMCT,-99) (VZ,13) (GSAT,7) (TWC,-99) (ATT,15) (UGA,-99)
UGA:(UGA,0) (ATT,50) (CMCT,-99) (TWC,-99) (GSAT,42) (VONA,39) (VZ,47)
VONA:(VONA,0) (VZ,8) (GSAT,2) (ATT,10) (UGA,-99) (CMCT,-99) (TWC,-99)
VZ:(VZ,0) (ATT,2) (CMCT,-99) (TWC,-99) (GSAT,-6) (UGA,-99) (VONA,-9)

Part 6: Spirit of the Project
The goal of this project is to implement a simplified version of a network protocol using a distributed algorithm. This means that your algorithm should be implemented at the network node level.
Each network node only knows its internal state, and the information passed to it by its direct neighbors. Declaring global variables will be a violation of the spirit of the project.

Part 7: FAQs
Q: Can I use outside Python modules?
A: Your solution should not require any outside Python modules. Please do not import any other modules.
Q: What is the best way to format and process node messages?
A: There is no right or wrong way to format messages. For best results, keep things simple.
Q: Is it required that the distance vectors displayed in my log files be alphabetized?
A: Look at the finish_round function in Topology.py. Note how the DVs are alphabetized each round, and this is reflected in the provided correct output logs. The nodes within individual vectors are not required to be sorted.
Q: Should my solution include an implementation of split horizon?
A: That is not a requirement for this project.
Q: What if there really is a valid path between two indirectly linked nodes with no cycle and the total cost is -99 or less?
A: We will not test your submission against a topology that does this. However, from the “Assumptions and Clarifications”, note: “a Node seeing an advertised vector of -99 from a downstream neighbor can assume this means it can reach that same destination at infinitely low cost (-99).”

What to Turn In
To complete this project, submit ONLY your DistanceVector.py file to Gradescope as a single file. Do not modify the name of DistanceVector.py. You can make an unlimited number of submissions to Gradescope. Your last submission will be your grade unless you activate a different submission. There are some very important guidelines for this file you must follow:
A. Ensure that your submission self-terminates. If your submission runs indefinitely (i.e., contains an infinite loop) or throws an error at runtime, it will not receive full credit. Manually killing your submission via console commands or interrupts is NOT an acceptable means of termination.
B.
Remove any print statements from your code before turning it in. Print statements left in the simulation, particularly in inefficient but logically sound implementations, have drastic effects on run-time. Ideally, your submission should take less than 10 seconds to process a topology. If you leave print statements in your code and they adversely affect the grading process, your work will not receive full credit. (Feel free to use print statements during the project and during debugging, but remove them before you submit.)
C. Ensure your logs are formatted properly. Logging is the only way that we can verify that your algorithm is running correctly. The output validator will catch most formatting mistakes, but you should inspect your output manually to make sure it matches the requested format. (See the TODO comment for logging located in DistanceVector.py for format details.)
D. Ensure your solution generates completely correct output. Partial credit for individual topologies will not be awarded, even if the distance vector logs are “mostly correct.”
E. Check your submission after uploading. As usual, we do not accept resubmissions past the stated deadlines.

What you can and cannot share
When sharing log files, leave alphabetization on so that your classmates can use the diff tool to see if you are getting the same log outputs as they are.

Rubric
40 pts – Provided Topologies (4 total): For correct Distance Vector results (log file) on the provided topologies.
60 pts – Unannounced Topologies (4 total): For correct Distance Vector results (log file) on topologies that you will not see in advance. They are slightly more complex than the provided ones and test some edge cases.
GRADING NOTE: There is no partial credit for individual topologies; each topology is either “passed” or “failed”.
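The “-99 is negative infinity” convention discussed in the FAQ above can be illustrated with a small relaxation step. This is a minimal sketch only, assuming dict-based distance vectors; the names relax and INFINITY_FLOOR are illustrative and are not part of the required DistanceVector.py interface.

```python
# Hypothetical sketch of one distance-vector relaxation step with the
# -99 "negative infinity" clamp described in the project FAQ.
# Not the required DistanceVector.py API; names are illustrative.

INFINITY_FLOOR = -99  # costs at or below this mean "infinitely low cost"

def relax(my_vector, neighbor_vector, link_cost):
    """Merge a neighbor's advertised distance vector into ours.

    my_vector / neighbor_vector: dict mapping destination -> cost.
    Returns True if my_vector changed (i.e., we should re-advertise).
    """
    changed = False
    for dest, advertised in neighbor_vector.items():
        if advertised <= INFINITY_FLOOR:
            # Neighbor reaches dest at infinitely low cost; so do we.
            candidate = INFINITY_FLOOR
        else:
            # Never let a negative cycle push a cost below the floor.
            candidate = max(advertised + link_cost, INFINITY_FLOOR)
        if dest not in my_vector or candidate < my_vector[dest]:
            my_vector[dest] = candidate
            changed = True
    return changed
```

Repeating such a step at every node until no vector changes is the essence of the distributed algorithm; the clamp guarantees the rounds terminate even when a negative cycle is reachable.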


[SOLVED] Cs6250_25fall extra credit project – internet wide events

Please respect the intellectual ownership of the course materials.

Contents
Internet Wide Events – Extra Credit Project
Project Support Limited
Goal
Task 1
Where to find events
Task 1 deliverables
Task 2
Task 2 deliverables
Resources
What to Submit
Rubric
What you can and cannot share

Project Support Limited
For this project, we provide limited support. There will not be any chat sessions, and we will only run one office hour to answer questions about the project. There will not be an office hour during the week of Exam 2. This is consistent with how we have traditionally handled extra credit projects in Computer Networks. You are expected to apply what you learned in the BGPM project in this extra credit assignment. For this assignment you are expected to work independently, but you can share ideas and charts on Edstem.

Goal
The goal of this project is to identify major events that have a large-scale impact on Internet connectivity for individual networks or even entire countries. In this project we will learn:
1. How to leverage tools and resources, so that we can understand how a large-scale event is reflected in Internet connectivity data
2. How to perform measurements so we can measure multiple aspects of the event’s impact

Task 1
First you will need to find public information about an event that had an impact on Internet connectivity for an individual AS or entire countries. Example types of events are DDoS attacks, prefix hijacking, political developments, social unrest, social media censorship, an earthquake, or other physical phenomena.

Where to find events
Below we list resources that can help you find information about similar events. Of course, you are more than welcome to expand your search on the news or on additional tools and resources.
• IODA (Internet Outage Detection and Analysis) provides multiple events, and a feed as well.
We can use the IODA system to find information specifically for the Myanmar event and the impacted ASes MPT (AS9988) and Mytel (AS136255) as shown here.
• Oracle Internet Intelligence. Using Oracle Internet Intelligence to learn about the Myanmar event as shown here.
• Google traffic disruptions. We can leverage Google traffic disruptions to find out more about the Myanmar event as shown here.
• Netblocks.org provides reports of disruption events.

Task 1 deliverables
1. Describe in a short essay of 10-15 sentences the event that took place.
2. Identify the time period the event took place.
3. Identify the AS numbers (and associated entities or organization names) of the networks that were involved/impacted.
4. Identify a metric that is associated with the control plane behavior of one impacted AS. Briefly justify why your metric is relevant.

Task 2
Study the control plane behavior of the impacted AS before, during and after the event took place.

Task 2 deliverables
1) Use the PyBGPstream library to study/track this metric. Example metrics are the number of prefixes that are advertised by an origin AS, the duration between Announcement and Withdrawal for a prefix, the AS path and any changes it is associated with, change of origin AS for a prefix, advertisements with conflicting multiple origin AS for a single prefix, etc. Of course, feel free to come up with your own metric that better reflects the behavior of the AS you are studying. We will need to be able to run your code and reproduce your result. Submitting hardcoded values (instead of code that pulls data, processes it, and produces your metric values) is not acceptable and will result in a 0 for the entire project. If you need a code snippet example to get started, the example at https://bgpstream.caida.org/docs/tutorials/pybgpstream#moas shows how to grab data for a specific time period.
2) Show a line graph with the metric of your choice before, during and after the event took place.
The goal of this graph is to show that an aspect of the control plane behavior of a network is clearly atypical during the event. So, the x-axis of the line graph will reflect time (in a timescale of your choice), and the y-axis will show your metric.

Resources
• PyBGPstream
• Public service for IP to AS mapping.
• Python library for efficient longest prefix matching, if you are given a list of origin AS and prefixes and you want a quick lookup.

What to Submit
To submit this project, submit your Jupyter Notebook file to the Canvas assignment page. Per the Edstem post on “How to Submit”, you will submit your Jupyter Notebook file directly to Canvas as GTLogin_iweec.ipynb – do not zip the file!
Tip on using the Python notebook: open it in VS Code and install the Jupyter plugin for VS Code.
NOTE: GTLogin should be replaced with the ID you use to log into Canvas (e.g., smith7, as in smith7_iweec.ipynb).
1. Your submission must have all the graphs generated on the page before saving!
2. Make sure your Jupyter Notebook file works in the virtual machine! That is the environment we will use to grade it.

Rubric
NOTE: The rubric below reflects a 100-point scale that will be adjusted to be the corresponding X% of potential extra credit.
Task 1 deliverables: 1. 25 points, 2. 5 points, 3. 5 points, 4. 5 points
Task 2 deliverables: 1. 50 points, 2. 10 points

https://policylibrary.gatech.edu/student-affairs/academic-honor-code
We strictly enforce Section 3. Student Responsibilities, including these prohibited actions:
– Unauthorized Access: Possessing, using, or exchanging improperly acquired written or verbal information in the preparation of a problem set, laboratory report, essay, examination, or other academic assignment.
– Unauthorized Collaboration: Unauthorized interaction with another Student or Students in the fulfillment of academic requirements.
– False Claims of Performance: False claims for work that has been submitted by a Student.

What you can and cannot share
1.
For this project, you can share graphs generated by the experiments.
3. You are not permitted to share the raw data or code from which these graphs are generated.
4. You cannot share your Jupyter Notebook ipynb file!
5. Also, do not share any data you record during experiments.


[SOLVED] Cs6250_25fall project- bgp measurements spring 2025

Table of Contents
Motivation
Introduction
Project Overview and Background
Required Background
Read the resources
Run Example Code Snippets
Important Note
Familiarize Yourself with the BGP Record Format and BGP Attributes
Update Example
RIB Example
Setup
Cache Files / Snapshots
Task 1. Understanding BGP Routing Table Growth
Task 1A: Unique Advertised Prefixes Over Time
Task 1B: Unique Autonomous Systems Over Time
Task 1C: Top-10 Origin AS by Prefix Growth
Task 2: Routing Table Growth: AS-Path Length Evolution Over Time
Task 3: Announcement-Withdrawal Event Durations
Task 4: RTBH Event Durations
Submission
Grading Rubric

Motivation
In this assignment, we will explore Internet Measurements, a field of Computer Networks which focuses on large-scale data collection systems and techniques that provide us with valuable insights and help us understand (and troubleshoot) how the Internet works. There are multiple systems and techniques that focus on DNS measurements, BGP measurements, topology measurements, etc. There are multiple conferences in this area, which we invite you to explore and keep up with the papers that are published.
The IMC conference is one of the flagship conferences in this area: ACM Internet Measurement Conference.
A gentle introduction into the Internet Measurement field is to work with large-scale BGP measurements and data to study topics such as:
• Characterizing growth of the Internet using various measures, such as the number of advertised prefixes, the number of Autonomous Systems, the percentage growth of prefixes advertised by Autonomous System, and the dynamics of Autonomous System path lengths
• Inferring problems related to short-lived Announcements and Withdrawals
• Inferring possible DDoS attacks by identifying community countermeasures such as “Remote Triggered Blackholing”

Introduction
In this project we will use the BGPStream tool and its Python interface PyBGPStream to understand the BGP protocol and interact with BGP data. The goal is to gain a better understanding of BGP and to experience how researchers, practitioners, and engineers have been using BGPStream to gain insight into the dynamics of the Internet. If you are interested in going deeper, you can use these same tools to observe and analyze real-time BGP data or download and analyze other historical BGP data.

Project Overview and Background
The zip file accompanying this assignment contains the code and data needed to implement the functions in the file bgpm.py. You will submit only bgpm.py to Gradescope, and all your code for the project must be contained within bgpm.py. This project description, in combination with the comments in bgpm.py, comprises the complete requirements for the project. There are two complete sets of data included in the zip file, and the provided test harness in check_solution.py will test each of your functions against both sets of data.
You are welcome to copy and modify check_solution.py to better suit your development and debugging workflow, but you will have the best chance of success with the hidden data set used for grading if your final submission passes all the tests in the unmodified check_solution.py. This project is designed to work in the class VM, where the BGPStream libraries are installed. Your code will need to run without modification in the course VM.

Required Background
For this project, we will be using BGPStream, an open-source software framework for live and historical BGP data analysis, supporting scientific research, operational monitoring, and post-event analysis. BGPStream and PyBGPStream are maintained by the Center for Applied Internet Data Analysis (CAIDA).

Read the resources
A high-level overview of how the BGPStream tool was developed was published by CAIDA in BGPStream: A Software Framework for Live and Historical BGP Data Analysis. This paper provides useful background and practical examples using BGPStream, so be sure to read it. Additionally, you should read African peering connectivity revealed via BGP route collectors, which provides a practical illustration of how the BGP collection system works.

Run Example Code Snippets
All the tasks are to be implemented using the Python interface to BGPStream.
You are strongly encouraged to browse the following resources to familiarize yourself with the tool, and to run the example code snippets:
– PyBGPStream API: https://bgpstream.caida.org/docs/api/pybgpstream
– PyBGPStream API Tutorial: https://bgpstream.caida.org/docs/tutorials/pybgpstream
– PyBGPStream Repository: https://github.com/CAIDA/pybgpstream
– Official Examples: https://github.com/CAIDA/pybgpstream/tree/master/examples

Important Note
As will become apparent when you peruse the above documentation and tutorial information, the majority of BGPStream use cases involve gathering data – either live or historical – directly from the Route Collectors (which we refer to simply as “collectors”). Each of the parameters to pybgpstream.BGPStream() winnows the data retrieved from the collector(s). Because we are using pre-cached historical data in this project, you will not need to specify a collector or a time range. You also don’t need to use any additional filtering. For this project, you can set up and configure your streams with:

stream = pybgpstream.BGPStream(data_interface="singlefile")
stream.set_data_interface_option("singlefile", type, fpath)

where type is one of ["rib-file", "upd-file"] and fpath is a string representing the path to a specific cache file. When processing multiple files, you will create one stream per file.

Familiarize Yourself with the BGP Record Format and BGP Attributes
It is critical that you understand the BGP record format, especially the meaning and content of the fields (data attributes). A detailed explanation of BGP records and attributes can be found in RFC 4271: A Border Gateway Protocol 4 (BGP-4). It’s also worth spending some time exploring the provided data using the BGPReader command line tool (“a command line tool that prints to standard output information about the BGP records and the BGP elems that are part of a BGP stream”).
Doing so will be particularly helpful in understanding how the fields described in RFC 4271 and elsewhere map to the BGP record and BGP elem concepts used by BGPStream and PyBGPStream. Because PyBGPStream allows you to extract the BGP attributes from BGP records using code, you will not have to interact with the BGP records in this format, but it is, nevertheless, helpful to see some examples using BGPReader to understand the fields. Here is sample command line output from BGPReader for illustration purposes:

# read records from an update file, filtering for IPv4 only
bgpreader -e --data-interface singlefile --data-interface-option upd-file=./rrc04/update_files/ris.rrc04.updates.1609476900.300.cache --filter 'ipv 4'

# read records from a rib file, filtering for IPv4 only
bgpreader -e --data-interface singlefile --data-interface-option rib-file=./rrc04/rib_files/ris.rrc04.ribs.1262332740.120.cache --filter 'ipv 4'

Update Example
The box below contains an example of an update record. In the record, the “|” character separates different fields. In yellow we have highlighted the type (A stands for Advertisement), the advertised prefix (210.180.224.0/19), the path (11666 3356 3786), and the origin AS (3786).

update|A|1499385779.000000|routeviews|routeviews.eqix|None|None|11666|206.126.236.24|210.180.224.0/19|206.126.236.24|11666 3356 3786|11666:1000 3356:3 3356:2003 3356:575 3786:0 3356:22 11666:1002 3356:666 3356:86|None|None

RIB Example
The following is a Routing Information Base (RIB) record example. Consecutive “|” characters indicate fields without data.

R|R|1445306400.000000|routeviews|route-views.sfmix|||32354|206.197.187.5|1.0.0.0/24|206.197.187.5|32354 15169|15169|||

Setup
Do not rely on the directory layout of the provided data. Gradescope does not mirror the directory layout from the provided files.
Specifically, in your final submission, do not directly access the filesystem in any way and do not import all or part of either os or pathlib. All filesystem interaction will occur via PyBGPStream, and the file paths will be taken from the Python list in the parameter named cache_files that is passed to each function.

Cache Files / Snapshots
Each of the cache files is a snapshot of BGP data collected by the collector at the time of the timestamp. In the rest of this assignment the term “snapshot” refers to the data in a particular cache file. Do not pull your own data. Your solution will be graded using cached data only. You will need to write code to process the cache files. Each entry in cache_files is a string containing the full path to a cache file. To access a given path, your code will need to set up the appropriate data interface in your BGPStream() constructor:

stream = pybgpstream.BGPStream(data_interface="singlefile")
stream.set_data_interface_option("singlefile", type, fpath)

where type is one of ["rib-file", "upd-file"] and fpath is a string representing the path to a specific cache file. When processing multiple files, you will create one stream per file. Tip: Your code shouldn’t make assumptions about the number of cache files.

Task 1. Understanding BGP Routing Table Growth
In this task you will measure the growth over time of Autonomous Systems and advertised prefixes. The growth of unique prefixes contributes to ever-growing routing tables handled by routers in the Internet core. As optional background reading, please read the seminal paper On Characterizing BGP Routing Table Growth.

Task 1A: Unique Advertised Prefixes Over Time
This task will use cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, measure the number of unique advertised prefixes over time. Each file is an annual snapshot.
Calculate the number of unique prefixes within each snapshot by completing the function unique_prefixes_by_snapshot(). Make sure that your function returns the data structure exactly as specified in bgpm.py.

Task 1B: Unique Autonomous Systems Over Time
This task will use cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, measure the number of unique Autonomous Systems over time. Each file is an annual snapshot. Calculate the number of unique ASes within each snapshot by completing the function unique_ases_by_snapshot(). Make sure that your function returns the data structure exactly as specified in bgpm.py.

Task 1C: Top-10 Origin AS by Prefix Growth
This task will use cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, calculate the percentage growth in advertised prefixes for each AS over the entire timespan represented by the snapshots by completing the function top_10_ases_by_prefix_growth(). Make sure that your function returns the data structure exactly as specified in bgpm.py. Consider each origin AS separately and measure the growth of the total unique prefixes advertised by that AS from its first appearance to its last appearance. To compute this, for each origin AS:
1. Identify the snapshots where the origin AS first appears and last appears in the dataset. Note: An AS is not guaranteed to appear in every snapshot, nor is it guaranteed to appear in the first and last snapshots.
2. Calculate the percentage increase of the advertised prefixes, using the identified first and last snapshot appearances.
For example, assuming 5 given cache files, let’s say AS X first appeared in the 2nd snapshot and last appeared in the 4th, and advertised the following number of prefixes: [0, 124, 215, 512, 0]. The percentage increase would then be (512 − 124) / 124 ≈ 3.13, or 313%.
3. Report the top 10 origin AS that experienced the largest growth, sorted from smallest to largest. Note: There are no ties, so don’t worry about implementing tie-breaking.

Task 2: Routing Table Growth: AS-Path Length Evolution Over Time
In this task you will measure whether an AS is reachable over longer or shorter path lengths as time progresses. Towards this goal you will measure the AS path lengths and how they evolve over time. This task will use cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, calculate the shortest path for each origin AS in each snapshot by completing the function shortest_path_by_origin_by_snapshot(). Make sure that your function returns the data structure exactly as specified in bgpm.py. For each snapshot, you will compute the shortest AS path length for each origin AS in the snapshot by following the steps below:
– Identify each origin AS present in the snapshot. For example, given the path “11666 3356 3786”, “3786” is the origin AS.
– For each origin AS, identify all the paths in which it appears as the origin AS.
– Compute the length of each path by considering each AS in the path only once. In other words, remove the duplicate entries for the same AS in the same path and count the total number of unique ASes in the path.
– Example: Given the path “25152 2914 3786 2914 18313”, “18313” is the origin AS and “2914” appears twice in the path. This is a path of length 4.
– Among all the paths for an AS within the snapshot, compute the shortest path length.
– Filter out all paths of length 1.
a.
If an AS path has a single unique AS or a single repeated AS (e.g., “25152 25152 25152”), the path has length 1 and should be ignored.
b. An AS path entry that looks like “{2914,14265}” is an aggregate or AS_SET and constitutes a single AS path entry. It does not need to be parsed in any way. You can read more about aggregation in RFC 4271.
   Example: The length of the AS path “25152 2914 18687 {2914,14265} 2945 18699” is 6.
   Example: The length of the AS path “25152 2914 18687 18687 {18687}” is 4. The entries “18687” and “{18687}” are distinct, so you only deduplicate “18687”.
c. You can ignore all other corner cases.

Task 3: Announcement-Withdrawal Event Durations
In this task, we will measure how long prefix Announcements last before they are withdrawn. This matters because, when a prefix gets Advertised and then Withdrawn, this information propagates and affects the volume of the associated BGP traffic. Optional background reading on this topic can be found in The Shape of a BGP Update. This task will use cache files from the update_files subdirectories. These are update files, so you will pass "upd-file" in your call to set_data_interface_option(). Using the data from the cache files, we will measure how long prefix Announcements last before they are withdrawn by completing the function aw_event_durations(). Make sure that your function returns the data structure exactly as specified in bgpm.py. In defining Announcement-Withdrawal (AW) events, we will only consider explicit withdrawals. An explicit withdrawal occurs when a prefix is advertised with an (A)nnouncement and is then (W)ithdrawn.
In contrast, an implicit withdrawal occurs when a prefix is advertised (A) and then re-advertised (A) – usually with different BGP attributes. To compute the duration of an explicit AW event for a given peerIP/prefix, you will need to monitor the stream of (A)nnouncements and (W)ithdrawals separately per peerIP/prefix pair.
– Example: Given the stream A1 A2 A3 W1 W2 W3 W4 for a specific peerIP/prefix pair, you have an implicit withdrawal A1-A2, another implicit withdrawal A2-A3, and, finally, an explicit withdrawal (and AW event) A3-W1. W1-W2, W2-W3, and W3-W4 are all meaningless, as there’s no active advertisement. The duration of the AW event is the time difference between A3 and W1. Again, we are only looking for the last A and the first W.
– Example: Given the stream A1 A2 A3 W1 W2 W3 W4 A4 A5 W4 for a specific peerIP/prefix pair, we have two AW events: A3-W1 and A5-W4.
– We consider only non-zero AW durations.

Task 4: RTBH Event Durations
In this task you will identify and measure the duration of Remote Triggered Blackholing (RTBH) events. You will need to become familiar with Blackholing events. Good resources for this include RFC 7999, Section 2, BGP communities: A weapon for the Internet (Part 2), and the video Nokia – SROS: RTBH – Blackhole Community. This task will use cache files from the update_files_blackholing subdirectories. These are update files, so you will pass "upd-file" in your call to set_data_interface_option(). Using the data from the cache files, we will identify events where prefixes are tagged with a Remote Triggered Blackholing (RTBH) community and measure the duration of the RTBH events by completing the function rtbh_event_durations(). Make sure that your function returns the data structure exactly as specified in bgpm.py. The duration of an RTBH event for a given peerIP/prefix pair is the time elapsed between the last (A)nnouncement of the peerIP/prefix that is tagged with an RTBH community value and the first (W)ithdrawal of the peerIP/prefix.
In other words, we are looking at the stream of Announcements and Withdrawals for a given peerIP/prefix and identifying only explicit withdrawals for an RTBH-tagged peerIP/prefix. To identify and compute the duration of an RTBH event for a given peerIP/prefix, you will need to monitor the stream of (A)nnouncements and (W)ithdrawals separately per peerIP/prefix pair.
– Example: Given the stream A1 A2 A3(RTBH) A4(RTBH) W1 W2 W3 W4 for a specific peerIP/prefix pair, A4(RTBH)-W1 denotes an RTBH event and the duration is calculated by taking the time difference between A4(RTBH) and W1.
– Note: There can be more than one RTBH event in a given stream. For example, in the stream A1 A2 A3(RTBH) A4(RTBH) W1 W2 W3 W4 A5(RTBH) W5, there are two RTBH events: A4(RTBH)-W1 and A5(RTBH)-W5.
– Example: Given the stream A1 A2 A3(RTBH) A4 A5 W1 W2 for a specific peerIP/prefix pair, the announcement A3(RTBH) followed by A4 is an implicit withdrawal. There is no explicit withdrawal and, thus, no RTBH event.
– In case of duplicate announcements, use the latest.
– Consider only non-zero duration events.

Submission
Submit bgpm.py to Gradescope.

Grading Rubric
10 points: Task 1A
10 points: Task 1B
10 points: Task 1C
30 points: Task 2
20 points: Task 3
20 points: Task 4

https://policylibrary.gatech.edu/student-affairs/academic-honor-code
We strictly enforce Section 3. Student Responsibilities, including these prohibited actions:
– Unauthorized Access: Possessing, using, or exchanging improperly acquired written or verbal information in the preparation of a problem set, laboratory report, essay, examination, or other academic assignment.
– Unauthorized Collaboration: Unauthorized interaction with another Student or Students in the fulfillment of academic requirements.
– False Claims of Performance: False claims for work that has been submitted by a Student.
Official resources and those referenced in the project document, such as the official Python documentation, official CAIDA documentation, code examples, repositories, etc., do not need to be cited. If you reference unofficial coding/programming resources such as W3Schools, Stack Overflow, etc., please cite them in your code.
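The last-A/first-W rule shared by Tasks 3 and 4 above can be sketched over a single peerIP/prefix stream. This is an illustration only: the record shape (kind, timestamp, tagged) and the name aw_durations are assumptions, and a real solution must first group update records per peerIP/prefix pair via PyBGPStream rather than receive them pre-grouped.

```python
# Hypothetical sketch of the explicit-withdrawal logic from Tasks 3/4,
# applied to one already-grouped peerIP/prefix stream. Record shape and
# names are assumptions, not the required bgpm.py interface.

def aw_durations(stream, rtbh_only=False):
    """Compute explicit A-W event durations for one peerIP/prefix stream.

    stream: time-ordered list of (kind, timestamp, tagged) tuples, where
    kind is "A" or "W" and tagged is True if the announcement carries an
    RTBH community. Returns the list of non-zero durations, each measured
    from the last active announcement to the first withdrawal.
    """
    durations = []
    last_announce = None  # (timestamp, tagged) of the active advertisement
    for kind, ts, tagged in stream:
        if kind == "A":
            # A re-advertisement is an implicit withdrawal of the previous
            # A; duplicate announcements mean we keep only the latest.
            last_announce = (ts, tagged)
        elif kind == "W" and last_announce is not None:
            start, was_tagged = last_announce
            duration = ts - start
            if duration > 0 and (was_tagged or not rtbh_only):
                durations.append(duration)
            # Subsequent Ws have no active advertisement and are ignored.
            last_announce = None
    return durations
```

For the first Task 3 example stream (A1 A2 A3 W1 W2 W3 W4), only the A3-W1 pair yields a duration. With rtbh_only=True the same pairing is kept only when the last active announcement carries the RTBH tag, which mirrors the Task 4 rule that A3(RTBH) followed by an untagged A4 produces no RTBH event.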


[SOLVED] Cs6250 – bgp measurements project summer 2025

Table of Contents Motivation 2 Introduction 2 Project Overview and Background 3 Required Background 3 Read the resources 4 Run Example Code Snippets 4 Important Note 4 Familiarize Yourself with the BGP Record Format and BGP Attributes 5 Update Example 6 RIB Example 6 Setup 7 Cache Files / Snapshots 7 Task 1. Understanding BGP Routing table Growth 8 Task 1A: Unique Advertised Prefixes Over Time 8 Task 1B: Unique Autonomous Systems Over Time 8 Task 1C: Top-10 Origin AS by Prefix Growth 9 Task 2: Routing Table Growth: AS-Path Length Evolution Over Time 10 Task 3: Announcement-Withdrawal Event Durations 12 Task 4: RTBH Event Durations 13 Submission 15 Grading Rubric 15Motivation In this assignment, we will explore Internet Measurements, a field of Computer Networks which focuses on large scale data collection systems and techniques that provide us with valuable insights and help us understand (and troubleshoot) how the Internet works. There are multiple systems and techniques that focus on DNS measurements, BGP measurements, topology measurements, etc. There are multiple conferences in this area, which we invite you to explore and keep up with the papers that are published. 
The IMC conference is one of the flagship conferences in this area: ACM Internet Measurement ConferenceA gentle introduction into the Internet Measurement field is to work with large scale BGP measurements and data to study topics such as: • Characterizing growth of the Internet using various measures, such as number of advertised prefixes, the number of Autonomous Systems, the percentage growth of prefixes advertised by Autonomous System, and the dynamics of Autonomous System path lengths • Inferring problems related to short-lived Announcement and Withdrawals, • Inferring possible DDoS attacks by identifying community countermeasures such as “Remote Triggered Blackholing” Introduction In this project we will use the BGPStream tool and its Python interface PyBGPStream to understand the BGP protocol and interact with BGP data. The goal is to gain a better understanding of BGP and to experience how researchers, practitioners, and engineers have been using BGPStream to gain insight into the dynamics of the Internet. If you are interested in going deeper, you can use these same tools to observe and analyze real-time BGP data or download and analyze other historical BGP data.Project Overview and Background The zip file accompanying this assignment contains the code and data needed to implement the functions in the file bgpm.py. You will submit only bgpm.py to Gradescope and all your code for the project must be contained within bgpm.py.This project description, in combination with the comments in bgpm.py, comprise the complete requirements for the project. There are two complete sets of data included in the zip file and the provided test harness in check_solution.py will test each of your functions against both sets of data. 
You are welcome to copy and modify check_solution.py to better suit your development and debugging workflow, but you will have the best chance of success with the hidden data set used for grading if your final submission passes all the tests in the unmodified check_solution.py. This project is designed to work in the class VM, where the BGPStream libraries are installed. Your code will need to run without modification in the course VM.

Required Background

For this project we will be using BGPStream, an open-source software framework for live and historical BGP data analysis, supporting scientific research, operational monitoring, and post-event analysis. BGPStream and PyBGPStream are maintained by the Center for Applied Internet Data Analysis (CAIDA).

Read the resources

A high-level overview of how the BGPStream tool was developed was published by CAIDA in "BGPStream: A Software Framework for Live and Historical BGP Data Analysis." This paper provides useful background and practical examples using BGPStream, so be sure to read it. Additionally, you should read "African peering connectivity revealed via BGP route collectors," which provides a practical illustration of how the BGP collection system works.

Run Example Code Snippets

All the tasks are to be implemented using the Python interface to BGPStream.
You are strongly encouraged to browse the following resources to familiarize yourself with the tool, and to run the example code snippets:
– PyBGPStream API: https://bgpstream.caida.org/docs/api/pybgpstream
– PyBGPStream API Tutorial: https://bgpstream.caida.org/docs/tutorials/pybgpstream
– PyBGPStream Repository: https://github.com/CAIDA/pybgpstream
– Official Examples: https://github.com/CAIDA/pybgpstream/tree/master/examples

Important Note

As will become apparent when you peruse the above documentation and tutorials, the majority of BGPStream use cases involve gathering data – either live or historical – directly from the Route Collectors (which we refer to simply as "collectors"), typically by passing parameters such as a collector list and a time range to pybgpstream.BGPStream(). Each of those parameters narrows the data retrieved from the collector(s). Because we are using pre-cached historical data in this project, you will not need to specify a collector or a time range, and you do not need any additional filtering. For this project, you can set up and configure your streams with:

stream = pybgpstream.BGPStream(data_interface="singlefile")
stream.set_data_interface_option("singlefile", type, fpath)

where type is one of ["rib-file", "upd-file"] and fpath is a string representing the path to a specific cache file. When processing multiple files, you will create one stream per file.

Familiarize Yourself with the BGP Record Format and BGP Attributes

It is critical that you understand the BGP record format, especially the meaning and content of the fields (data attributes). A detailed explanation of BGP records and attributes can be found in RFC 4271: A Border Gateway Protocol 4 (BGP-4). It is also worth spending some time exploring the provided data using the BGPReader command-line tool ("a command line tool that prints to standard output information about the BGP records and the BGP elems that are part of a BGP stream").
Doing so will be particularly helpful in understanding how the fields described in RFC 4271 and elsewhere map to the BGP record and BGP elem concepts used by BGPStream and PyBGPStream. Because PyBGPStream allows you to extract the BGP attributes from BGP records in code, you will not have to interact with the BGP records in this raw format, but it is nevertheless helpful to see some examples. Here is sample command-line usage of BGPReader for illustration purposes:

# read records from an update file, filtering for IPv4 only
bgpreader -e --data-interface singlefile --data-interface-option upd-file=./rrc04/update_files/ris.rrc04.updates.1609476900.300.cache --filter 'ipversion 4'

# read records from a rib file, filtering for IPv4 only
bgpreader -e --data-interface singlefile --data-interface-option rib-file=./rrc04/rib_files/ris.rrc04.ribs.1262332740.120.cache --filter 'ipversion 4'

Update Example

The line below is an example of an update record. In the record, the "|" character separates fields. Note the type (A stands for Advertisement), the advertised prefix (210.180.224.0/19), the path (11666 3356 3786), and the origin AS (3786).

update|A|1499385779.000000|routeviews|routeviews.eqix|None|None|11666|206.126.236.24|210.180.224.0/19|206.126.236.24|11666 3356 3786|11666:1000 3356:3 3356:2003 3356:575 3786:0 3356:22 11666:1002 3356:666 3356:86|None|None

RIB Example

The following is a Routing Information Base (RIB) record example. Consecutive "|" characters indicate fields without data.

R|R|1445306400.000000|routeviews|route-views.sfmix|||32354|206.197.187.5|1.0.0.0/24|206.197.187.5|32354 15169|15169|||

Setup

Do not rely on the directory layout of the provided data. Gradescope does not mirror the directory layout from the provided files.
Specifically, in your final submission, do not directly access the filesystem in any way and do not import all or part of either os or pathlib. All filesystem interaction will occur via PyBGPStream, and the file paths will be taken from the Python list in the parameter named cache_files that is passed to each function.

Cache Files / Snapshots

Each cache file is a snapshot of BGP data collected by the collector at the time of the timestamp. In the rest of this assignment, the term "snapshot" refers to the data in a particular cache file. Do not pull your own data: your solution will be graded using cached data only. You will need to write code to process the cache files. Each entry in cache_files is a string containing the full path to a cache file. To access a given path, your code will need to set up the appropriate data interface in your BGPStream() constructor:

stream = pybgpstream.BGPStream(data_interface="singlefile")
stream.set_data_interface_option("singlefile", type, fpath)

where type is one of ["rib-file", "upd-file"] and fpath is a string representing the path to a specific cache file. When processing multiple files, you will create one stream per file. Tip: your code shouldn't make assumptions about the number of cache files.

Task 1. Understanding BGP Routing Table Growth

In this task you will measure the growth over time of Autonomous Systems and advertised prefixes. The growth of unique prefixes contributes to the ever-growing routing tables handled by routers in the Internet core. As optional background reading, see the seminal paper "On Characterizing BGP Routing Table Growth."

Task 1A: Unique Advertised Prefixes Over Time

This task uses cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, measure the number of unique advertised prefixes over time. Each file is an annual snapshot.
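The per-snapshot counting reduces to collecting prefixes into a set. The sketch below is illustrative, not taken from bgpm.py; it assumes you gather prefix strings from each RIB elem (via elem.fields["prefix"]) using the singlefile stream shown above:

```python
def count_unique_prefixes(prefixes):
    """Count distinct advertised prefixes in one snapshot.

    `prefixes` is any iterable of prefix strings, e.g. gathered with:
        stream = pybgpstream.BGPStream(data_interface="singlefile")
        stream.set_data_interface_option("singlefile", "rib-file", fpath)
        prefixes = (elem.fields["prefix"] for rec in stream.records()
                                          for elem in rec)
    """
    # A set keeps one copy of each prefix, so duplicates are ignored
    return len(set(prefixes))

# One snapshot containing a duplicate advertisement of 1.0.0.0/24:
count_unique_prefixes(["1.0.0.0/24", "8.8.8.0/24", "1.0.0.0/24"])  # → 2
```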
Calculate the number of unique prefixes within each snapshot by completing the function unique_prefixes_by_snapshot(). Make sure that your function returns the data structure exactly as specified in bgpm.py.

Task 1B: Unique Autonomous Systems Over Time

This task uses cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, measure the number of unique Autonomous Systems over time. Each file is an annual snapshot. Calculate the number of unique ASes within each snapshot by completing the function unique_ases_by_snapshot(). Make sure that your function returns the data structure exactly as specified in bgpm.py.

Task 1C: Top-10 Origin AS by Prefix Growth

This task uses cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, calculate the percentage growth in advertised prefixes for each AS over the entire timespan represented by the snapshots by completing the function top_10_ases_by_prefix_growth(). Make sure that your function returns the data structure exactly as specified in bgpm.py. Consider each origin AS separately and measure the growth of the total unique prefixes advertised by that AS from its first appearance to its last appearance. To compute this, for each origin AS:
1. Identify the snapshots where the origin AS first appears and last appears in the dataset. Note: don't make assumptions about when an AS can appear – an AS is not guaranteed to appear in every snapshot, nor is it guaranteed to appear in the first and last snapshots.
2. Calculate the percentage increase of the advertised prefixes, using the identified first and last snapshot appearances.
For example, assuming 5 given cache files, say AS X first appeared in the 2nd snapshot, last appeared in the 4th, and advertised the following numbers of prefixes across the snapshots: 0, 124, 215, 512, 0. The percentage increase would then be (512 − 124) / 124 = 3.13, or 313%.
3. Report the top 10 origin ASes that experienced the largest growth, sorted from smallest to largest. Note: there are no ties, so you don't need to implement tie-breaking.

Task 2: Routing Table Growth: AS-Path Length Evolution Over Time

In this task you will measure whether an AS is reachable over longer or shorter path lengths as time progresses. Toward this goal you will measure AS-path lengths and how they evolve over time. This task uses cache files from the rib_files subdirectories. These are RIB files, so you will pass "rib-file" in your call to set_data_interface_option(). Using the data from the cache files, calculate the shortest path for each origin AS in each snapshot by completing the function shortest_path_by_origin_by_snapshot(). Make sure that your function returns the data structure exactly as specified in bgpm.py. For each snapshot, compute the shortest AS-path length for each origin AS by following the steps below:
– Identify each origin AS present in the snapshot. For example, given the path "11666 3356 3786", "3786" is the origin AS.
– For each origin AS, identify all the paths for which it appears as the origin AS.
– Compute the length of each path by counting each AS in the path only once. In other words, remove duplicate entries for the same AS in the same path and count the total number of unique ASes in the path.
  – Example: given the path "25152 2914 3786 2914 18313", "18313" is the origin AS and "2914" appears twice in the path. This is a path of length 4.
– Among all the paths for an AS within the snapshot, compute the shortest path length.
– Filter out all paths of length 1.
  a. If an AS path has a single unique AS or a single repeated AS (e.g., "25152 25152 25152"), the path has length 1 and should be ignored.
  b. An AS-path entry that looks like "{2914,14265}" is an aggregate (AS_SET) and constitutes a single AS-path entry. It does not need to be parsed in any way. You can read more about aggregation in RFC 4271. Example: the length of the AS path "25152 2914 18687 {2914,14265} 2945 18699" is 6. Example: the length of the AS path "25152 2914 18687 18687 {18687}" is 4; the entries "18687" and "{18687}" are distinct, so you only deduplicate "18687".
  c. You can ignore all other corner cases.

Task 3: Announcement-Withdrawal Event Durations

In this task we will measure how long prefix Announcements last before they are withdrawn. This matters because, when a prefix is Advertised and then Withdrawn, this information propagates and affects the volume of the associated BGP traffic. Optional background reading on this topic can be found in "The Shape of a BGP Update." This task uses cache files from the update_files subdirectories. These are update files, so you will pass "upd-file" in your call to set_data_interface_option(). Using the data from the cache files, we will measure how long prefix Announcements last before they are withdrawn by completing the function aw_event_durations(). Make sure that your function returns the data structure exactly as specified in bgpm.py. In defining Announcement-Withdrawal (AW) events, we will only consider explicit withdrawals. An explicit withdrawal occurs when a prefix is advertised with an (A)nnouncement and is then (W)ithdrawn.
In contrast, an implicit withdrawal occurs when a prefix is advertised (A) and then re-advertised (A) – usually with different BGP attributes. To compute the duration of an explicit AW event for a given peerIP/prefix, you will need to monitor the stream of (A)nnouncements and (W)ithdrawals separately per peerIP/prefix pair.
– Example: given the stream A1 A2 A3 W1 W2 W3 W4 for a specific peerIP/prefix pair, you have an implicit withdrawal at A1-A2, another implicit withdrawal at A2-A3, and, finally, an explicit withdrawal (and AW event) at A3-W1. W1-W2, W2-W3, and W3-W4 are all meaningless, as there is no active advertisement. The duration of the AW event is the time difference between A3 and W1. Again, we pair only the last A with the first W.
– Example: given the stream A1 A2 A3 W1 W2 W3 W4 A4 A5 W5 for a specific peerIP/prefix pair, we have two AW events: A3-W1 and A5-W5.
– We consider only non-zero AW durations.

Task 4: RTBH Event Durations

In this task you will identify and measure the duration of Remotely Triggered Blackholing (RTBH) events. You will need to become familiar with blackholing events. Good resources include RFC 7999 (Section 2), "BGP communities: A weapon for the Internet (Part 2)," and the video "Nokia – SROS: RTBH – Blackhole Community." This task uses cache files from the update_files_blackholing subdirectories. These are update files, so you will pass "upd-file" in your call to set_data_interface_option(). Using the data from the cache files, we will identify events where prefixes are tagged with a Remote Triggered Blackholing (RTBH) community and measure the duration of the RTBH events by completing the function rtbh_event_durations(). Make sure that your function returns the data structure exactly as specified in bgpm.py. The duration of an RTBH event for a given peerIP/prefix pair is the time elapsed between the last (A)nnouncement of the peerIP/prefix that is tagged with an RTBH community value and the first (W)ithdrawal of the peerIP/prefix.
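The last-A/first-W pairing used in Task 3 (Task 4 adds the requirement that the effective announcement carry an RTBH community) can be sketched as a small per-pair state machine. The helper name and the numeric timestamps below are illustrative, not from bgpm.py:

```python
def aw_durations(events):
    """Durations of explicit AW events for ONE peerIP/prefix pair.

    `events` is a chronological list of (msg_type, timestamp) tuples,
    where msg_type is "A" or "W".  A re-advertisement is an implicit
    withdrawal, so only the latest A is paired with the first following W;
    zero durations are dropped and Ws with no active A are ignored.
    """
    durations = []
    last_a = None
    for typ, ts in events:
        if typ == "A":
            last_a = ts                # newest A supersedes earlier ones
        elif typ == "W" and last_a is not None:
            if ts - last_a > 0:        # keep non-zero durations only
                durations.append(ts - last_a)
            last_a = None              # no active advertisement anymore
    return durations

# Stream A1 A2 A3 W1 W2 W3 W4 with timestamps 1..7 → one AW event (A3 to W1):
aw_durations([("A", 1), ("A", 2), ("A", 3),
              ("W", 4), ("W", 5), ("W", 6), ("W", 7)])  # → [1]
```

For Task 4, the same bookkeeping applies, except `last_a` is only considered armed when the announcement carries an RTBH community tag.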
In other words, we are looking at the stream of Announcements and Withdrawals for a given peerIP/prefix and identifying only explicit withdrawals for an RTBH-tagged peerIP/prefix. To identify and compute the duration of an RTBH event for a given peerIP/prefix, you will need to monitor the stream of (A)nnouncements and (W)ithdrawals separately per peerIP/prefix pair.
– Example: given the stream A1 A2 A3(RTBH) A4(RTBH) W1 W2 W3 W4 for a specific peerIP/prefix pair, A4(RTBH)-W1 denotes an RTBH event, and the duration is the time difference between A4(RTBH) and W1.
– Note: there can be more than one RTBH event in a given stream. For example, in the stream A1 A2 A3(RTBH) A4(RTBH) W1 W2 W3 W4 A5(RTBH) W5, there are two RTBH events: A4(RTBH)-W1 and A5(RTBH)-W5.
– Example: given the stream A1 A2 A3(RTBH) A4 A5 W1 W2 for a specific peerIP/prefix pair, the announcement A3(RTBH) followed by A4 is an implicit withdrawal. There is no explicit withdrawal and, thus, no RTBH event.
– In case of duplicate announcements, use the latest.
– Consider only non-zero duration events.

Submission

Submit bgpm.py to Gradescope.

Grading Rubric

Points  Task to be completed
10      Task 1A
10      Task 1B
10      Task 1C
30      Task 2
20      Task 3
20      Task 4

Academic honor code: https://policylibrary.gatech.edu/student-affairs/academic-honor-code
We strictly enforce Section 3, Student Responsibilities, including these prohibited actions:
– Unauthorized Access: possessing, using, or exchanging improperly acquired written or verbal information in the preparation of a problem set, laboratory report, essay, examination, or other academic assignment.
– Unauthorized Collaboration: unauthorized interaction with another Student or Students in the fulfillment of academic requirements.
– False Claims of Performance: false claims for work that has been submitted by a Student.
Official resources and those referenced in the project document, such as the official Python documentation, official CAIDA documentation, code examples, repositories, etc., do not need to be cited. If you reference unofficial coding/programming resources such as W3Schools, Stack Overflow, etc., please cite them in your code.
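As a closing illustration of the Task 2 deduplication rule above: each AS-path entry is counted once, and an AS_SET entry like "{2914,14265}" is kept verbatim as a single entry. The helper name below is illustrative, not from bgpm.py:

```python
def unique_path_length(as_path: str) -> int:
    """Length of an AS path counting each entry once.

    Entries are whitespace-separated; '{...}' AS_SET entries are kept
    verbatim, so '{18687}' and '18687' are distinct entries.
    """
    return len(set(as_path.split()))

unique_path_length("25152 2914 3786 2914 18313")                # → 4
unique_path_length("25152 2914 18687 {2914,14265} 2945 18699")  # → 6
unique_path_length("25152 2914 18687 18687 {18687}")            # → 4
unique_path_length("25152 25152 25152")                         # → 1 (filtered out)
```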


CS 6238 Project 4

Secure Shared Store (3S)

IMPORTANT NOTES:
1. We will not accept the PyCrypto, Crypto, or Cryptodome libraries in Project 4. Use the cryptography library (https://cryptography.io/en/latest/) for this project. We will not give credit for any effort using the prohibited libraries.
2. The video is mandatory; 10 points will be deducted if the video is not submitted.

Goals & Assumptions

This project is based on the topic of distributed-systems security, which is covered in Modules 11 and 12. The goal of the project is to gain hands-on experience in implementing secure distributed services. You will develop a simple Secure Shared Store (3S) service that allows for the storage and retrieval of documents created by multiple users who access the documents from their local machines. In the implementation, the system should consist of one or more 3S client nodes and a single server that stores the documents. Users should be able to log in to the 3S server through any client by providing their private key, as discussed in Module 12. Session tokens are generated upon successful authentication of the users. Users can then check in, check out, and delete documents as allowed by access-control policies defined by the owner of the document. To implement such a distributed system, we will need to make use of certificates to secure communication between clients and the server and to authenticate the sources of requests. You will need to make use of a Certificate Authority (CA) that generates certificates for users, client nodes, and the server. All nodes trust the CA.

Project Setup

We have provided a Virtual Machine for the project. Links to download the image (.ova file) will be posted on Ed Discussion. The default account on the VM is cs6238 and the password is cs6238. The root password is also cs6238. In an ideal setting, the 3S server and the client would be on separate nodes. For simplicity, we have set up only one VM.
The server and client nodes are abstracted as separate folders within the VM. For example, the server folder represents the server and the client1 folder represents a client node. The desktop contains a Project4 folder which has the skeletal implementation of the 3S service. You will be required to complete the implementation to satisfy all the functionalities detailed below. The Project4 folder contains:
1. CA – represents the Certificate Authority and contains the CA certificates.
2. server – represents the server. It contains the server certificates and the 3S application code. The 3S server is implemented using Python Flask, and server.py contains the outline of the server code, which is to be fully completed.
3. client1 – represents one of the client nodes. client.py has the skeletal implementation of the client. You will be required to generate client certificates and place them in the client1/certs folder.
4. client2 – represents another client node; its environment should be similar to client1's.

Fig: Folder structure of Project4

Certificates

As discussed above, we will need a Certificate Authority that is trusted by all nodes. This CA is used to generate certificates for the users, client nodes, and the server. One can use a library such as OpenSSL to set up the CA and generate certificates. For this project, we have created a CA, which has been used to generate certificates for the server. You are required to generate certificates for the client nodes using this CA. The CA (certificate and key) was generated using the password (passphrase) cs6238. Detailed instructions on generating certificates are in Appendix A. When the client keys and certificates are created, they should be placed in the clientX/certs folder and named clientX.key and clientX.crt.

3S Implementation Details

After a 3S server starts, a client node can make requests to the server.
Let's assume that client nodes have a discovery service that allows them to find the hostname where 3S runs. The hostname, in this case, is secure-shared-store, and the server's certificate contains secure-shared-store as its common name. Whenever the client node makes a request, mutual authentication is performed and a secure communication channel is established between the client node and the server. Here we use nginx to perform mutual TLS authentication (mTLS). Every request from the client node should include the certificate of the client node for authentication. As mentioned before, the 3S service should support functions such as login, checkin, checkout, grant, delete, and logout. You will have to complete the skeleton code provided for the server and client to achieve these functionalities. Details are as follows:

1. login(User UID, UserPrivateKey): This call allows a client node to generate the statements necessary to convince the 3S server that requests made by the client are for the user having UID as its user-id. The client node takes UID and the user's private key as two separate inputs from the user; the filename of the key is provided as input, not the key value itself. A user's private key should only be used to sign the necessary statement, and never sent to the server. The statement should be of the form "ClientX as UserY logs into the Server", where X represents the client-id and Y represents the user-id. On successful login, the server should return a unique session token for the user. The session token will have to be included in all subsequent requests and plays the role of the statement in those requests. You must also ensure that each user has a unique UID. You can assume that a given client node only handles requests of a single user in one session (if a user logs in successfully from another client, the previous session is invalidated).
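The signing step can be sketched with the required cryptography library. This is a minimal sketch, not the project's prescribed implementation: a throwaway key is generated here, whereas the real client would load the userY.key file the user supplies, and the padding/hash choices are assumptions, not mandated by the spec:

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Throwaway key for illustration only; the real client would instead load
# the user's key file, e.g.:
#   private_key = serialization.load_pem_private_key(
#       open(key_path, "rb").read(), password=None)
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

statement = b"Client1 as User1 logs into the Server"
signature = private_key.sign(statement, padding.PKCS1v15(), hashes.SHA256())

# Server side: verify against the user's public key; verify() raises
# InvalidSignature if the statement or signature was tampered with.
private_key.public_key().verify(signature, statement,
                                padding.PKCS1v15(), hashes.SHA256())
```

Note that only the signed statement travels to the server; the private key never leaves the client, matching the requirement above.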
An example of public/private key creation with OpenSSL: https://www.digicert.com/kb/ssl-support/openssl-quick-reference-guide.htm.

When the Security Flag is set to Confidentiality (represented by "1"), the server generates a random AES key for the document, uses it for encryption, and stores the data in encrypted form. To decrypt the data at a later time, this key is itself encrypted using the server's public key and stored with the document metadata. When the Security Flag is set to Integrity (represented by "2"), the server stores the document along with a signed copy.

3. checkout(Document DID): After a session is established, a user can use this function to request a specific document, based on the document identifier (DID), over the secure channel to the server.
• The request is granted only if the checkout request is made by the owner of the document or by a user who is authorized to perform this action.
• If successful, a copy of the document is sent to the client node.
• The server will have maintained information about documents (e.g., metadata) during checkin that allows it to locate the requested document, decrypt it, and send it back to the requestor.
• Once the document is checked out, it must be stored in the documents/checkout folder within the client directory.
When a request is made for a document stored with Confidentiality as the Security Flag, the server locates the encrypted document and its key, decrypts the data, and sends it back over the secure channel. Similarly, when a request is made for a document stored with Integrity as the Security Flag, the signature of the document must be verified before sending a copy to the client. Additionally, when a request is made to check in a document that is checked out in the current active session, the client must move (not copy) the document from the "/documents/checkout" folder into the "/documents/checkin" folder.
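The move-not-copy requirement maps directly onto shutil.move. A sketch using the folder names from the spec (the helper name and client_dir parameter are illustrative):

```python
import shutil
from pathlib import Path

def move_to_checkin(client_dir, filename):
    """Move (not copy) a checked-out document into the checkin folder."""
    src = Path(client_dir, "documents", "checkout", filename)
    dst = Path(client_dir, "documents", "checkin", filename)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))  # after this, the file is gone from checkout/
    return dst
```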
The client implementation must handle the transfer of these files between the folders automatically.

4. grant(Document DID, TargetUser TUID, AccessRight R, time T):
a. A grant can only be issued by the owner of the document.
b. This changes the defined access-control policy to allow the target user (TUID) to have authorization for the specified action (R) on the specified document (DID).
c. AccessRight R can be:
   i. checkin (represented by input 1)
   ii. checkout (represented by input 2)
   iii. both (represented by input 3)
for a time duration T (in seconds). If the TargetUser is ALL (TUID=0), the authorization is granted to all users in the system for this specific document. If multiple grants have been authorized for a particular document and user, the latest grant is the effective rule; that is, the latest grant for the tuple (DID, TUID) should persist. A few clarification scenarios for grant:
− If an initial grant (file1, user1, 2, 100) succeeds and then a grant request (file1, 0, 1, 50) succeeds, file1 is accessible for checkin only, to all users, for 50 seconds. User1 loses the checkout access given earlier.
− If grant (file1, 0, 3, 100) exists and then a grant request (file1, user2, 2, 50) succeeds, file1 is accessible to user2 for checkout for 50 seconds, and the previous grant is invalidated.

5. delete(Document DID): If the user currently logged in at the requesting client is the document owner, the file is safely deleted. No one should be able to access the data it contained in the future, even if the server gets compromised. The deletion of a confidential document should result in the permanent removal of the key used to encrypt it.

6. logout(): Terminates the current session. If any documents received from the server were modified, their new copies must be sent to the server before session termination completes.
While checking the modified documents back in, you must set Integrity as the Security Flag.

Since this is a security class, you should use secure coding practices. You are also expected to use static code-analysis tools such as Pylint, Pyflakes, etc., and minimize the use of unsafe function calls (justify any such calls you need to make with inline comments). The report should list the tools used to ensure that your code does not have any vulnerabilities. The report should also discuss the threat model and which threats are handled by your implementation.

Fig. Project Flow

Project Deliverables
1. Report. It should cover the following aspects (each answer need not be more than a few sentences):
● Architectural design details:
− How mutual authentication is achieved in the current implementation of 3S.
− Details on the cryptographic libraries and functions used to handle secure file storage.
− How the user information and document metadata are stored.
● Implementation details:
− Details of how the required functionalities were implemented.
− A list of any assumptions made.
● Results of the static code analysis and the tools used.
● Threat modelling and the threats currently handled by your implementation.
● Your report should be named Report.pdf.
2. Server code
● This is the completed version of the provided server.py.
● It must be named server.py.
3. Client node
● This is the completed version of the provided client.py.
● It must be named client.py.
4. Requirements
• This should include all additional Python modules used in your implementation.
• List any additional Python libraries in a file named "requirements.txt". This will be used by the autograder to replicate your environment.
Please ensure that you do not zip the files in your submission. Also, please stick to the specified naming conventions, since an autograder will be evaluating your submissions.
IMPORTANT: Please ensure that you submit only these 4 files along with the video (see Video Requirements below) and follow the specified naming conventions. Any deviation from these guidelines will cause errors with the autograder and result in a significant loss of points.

Additional Instructions
● Please go through the comments in server.py and client.py and follow the provided instructions. Be sure to complete the sections where TODOs are specified. You can add utility functions as required.
● All requests sent from the client must use the post_request() utility function; do not modify this function. A sample response format is given in the login() function within the server code. Feel free to use the same format for the other server functions.
● Expected response status codes for different scenarios are provided in server.py for each function. Ensure that the completed code behaves accordingly, since the autograder uses the status for verification. (The status codes provided are custom, not the standard HTTP ones. Using custom status codes in HTTP is not best practice, but it is done here for the purpose of the autograder.)
● Failure to follow the provided instructions will result in unnecessary point loss.
● Run the script start_server.sh present in the server directory to start the server. It essentially invokes server.py.
● Make sure that the status of nginx is active by using the command systemctl status nginx. If the status is not active, you can restart it with sudo systemctl restart nginx.
● Ensure that your implementation can be run and tested by just invoking start_server.sh and client.py. Your code must also automatically initialize any required databases; this is necessary so that the autograder can run successfully.
Additionally, ensure that your code creates any folders your implementation needs, since we will be using just the two Python scripts for evaluation.
● No specific test cases will be provided for this project, and you are free to develop a test harness that consists of a sequence of calls made to the 3S server. However, we will soon release a basic testing script to give you an idea of how inputs are provided to the client.
● Make sure to test your 3S implementation using at least 2 clients and 3 users. The autograder will use users named 'user1', 'user2', 'user3' (these are the UIDs) and clients named 'client1', 'client2'. Please make sure your implementation supports these. The autograder also expects userX.key and userX.pub as the private and public keys for those users. So, be sure to test with these three users and create any metadata (in your application and database) required to support this mapping with the necessary files, keys, or paths. IMPORTANT: Do not hardcode the public or private key names (e.g., user1.key or user1.pub) in your code. Make sure the usernames and keys are all in lowercase only.
● When the client keys and certificates are created, they should be placed in the clientX/certs folder and named clientX.key and clientX.crt (this is an important setup step). These must be used in client.py when post_request() is invoked.
● While sending requests, some of you might encounter SSL errors; to avoid this issue, have a look at the Python package certifi. (You can use this link as a reference: https://incognitjoe.github.io/adding-certs-to-requests.html)
● We encourage you all to discuss the project at a high level on Ed Discussion. Please ensure that you are not over-sharing and maintain academic honesty.
Halfway through the project, if there are many common doubts, we will consolidate the clarification posts and share them as a note.

Grading Outline
Report – 30 points
● Architectural design details – 5 points
● Implementation details – 15 points
● Security analysis of the implemented secure shared store – 5 points
● Threat modelling – 5 points
Implementation of 3S – 70 points
Each function in the implementation will be scored as below:
1. Login – 10 points
● Handling the private key of the user and verifying the signature of the created statements
● Generation of a session token used for further requests
2. Checkin – 15 points
● Secure file transfer of documents
● Handling the security flag – Integrity and Confidentiality
● Ownership/authorization check
3. Checkout – 15 points
● Secure file transfer of documents
● Handling the security flag – Integrity and Confidentiality
● Ownership/authorization check
4. Grant – 15 points
● Granting authorization to other users
● Handling expiry of granted access (in seconds)
5. Delete – 10 points
● Ensuring deletion of files
● Ownership/authorization check
6. Logout – 5 points
● Checking in all the modified checked-out files and session termination

Video Requirements
The following steps are required to be shown as part of your video:
● Download your latest submission from Canvas.
● Walk through the 6 functions that are mentioned as part of the implementation requirements.
● Follow these steps when recording the video:
1. Login as user1 with user1.key (Success)
2. Login as user2 with user2.key (Success)
3. Login as user3 with user1.key (Fail)
4. user1 checkin file1 with Security Flag (Success)
5. user2 checkin file2 with Integrity Flag (Success)
6. user1 checkout file1 (Success)
7. user2 checkout file2 (Success)
8. user1 checkout file2 (Fail)
9. user2 checkout file1 (Fail)
10. user1 grant checkout file1 to user2 (Success)
11. user2 checkout file1 (within the granted time for step 10) (Success)
12. user3 checkout file1 (within the granted time for step 10) (Fail)
13. Wait for the granted period in step 10 to expire and try step 11 again. (Fail)
14. user1 delete file1 (Success)
15. user1 delete file2 (Fail)
16. user2 delete file2 (Success)
17. Logout (Success)
The video should show the file locations and content. Try to show as many details about the functionality of the program as possible.
● The entire duration of this video should not exceed 10 minutes, but we are flexible.

APPENDIX A
Certificate Generation: The resource below describes how to set up a Certificate Authority (CA) and then how its certificate is used to generate certificates for the nodes.
● https://deliciousbrains.com/ssl-certificate-authority-for-local-https-development/
We have already set up a CA. You can find the CA certificates in the CA folder of Project4. We have also generated the server keys and certificate (certname is secure-shared-store) using the CA certificate. The following command was used to extract the public key from the certificate:
openssl x509 -pubkey -noout -in secure-shared-store.crt > secure-shared-store.pub
You can use the above resources to generate certificates and keys for the client nodes and users.
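The Grant function above (and video steps 10–13) hinges on an expiry measured in seconds. A minimal sketch of the check, assuming a hypothetical grant record with `granted_at` and `duration` fields (the names are illustrative, not part of the provided code):

```python
import time

def is_grant_valid(grant, now=None):
    """Return True while a time-limited grant is still active.

    `grant` is a hypothetical record, e.g.
    {"granted_at": <epoch seconds>, "duration": <seconds>}.
    """
    now = time.time() if now is None else now
    return now < grant["granted_at"] + grant["duration"]
```

Checkout would combine a check like this with the ownership check; an expired grant must behave exactly like no grant at all (step 13 fails).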


[SOLVED] Cs6238 project ii -password hardening with 2fa

Password Hardening with 2FA

Learning Objectives: The goal of this project is to harden password-based authentication by including information obtained from a second factor (two-factor authentication, or 2FA). We will use the Linux login command implementation to explore this. Although somewhat contrived, the motivation for the 2FA scheme explored in this project is similar to the password hardening paper discussed in course lectures. More specifically, information maintained by the system to check the validity of a login request is updated after each login request to limit the effectiveness of offline guessing attacks. The following are the learning objectives of this project.
1. Understand how password-based authentication is implemented.
2. Understand the benefits of multiple factors for stronger authentication.
3. Augment a password-based login command code to include an additional factor obtained from a second source.
4. Analyze security benefits (or lack of them) of the 2FA implementation.
To keep it simple, this project focuses only on hardening the basic login scheme for a desktop/laptop system. However, this scheme can be extended to provide password hardening for remote logins.

Project Setup:
Note: The link to the VM will be posted on an Ed Discussion pinned post.
For this project, you will be provided with an Ubuntu-based Virtual Machine (VM). This VM was tested on Oracle VM VirtualBox 7.0 and can be directly imported into it. The VM has a default account “cs6238” set up with normal user access privileges. To access a file as root, open a terminal, type “sudo su”, enter the “cs6238” password, and then access the file. The password for the “cs6238” account is “cs6238”. You should not include the quotes while entering the password. When you log into account “cs6238”, follow these instructions:
1. cd /home/cs6238/Desktop
As you can see from Fig 1, the file ‘/etc/shadow’ can be accessed as a root user.
Fig 1. Reading file ‘/etc/shadow’ as root
Fig 2. Desktop folder
This folder contains:
• 2 Python code files, check_login.py and create_user.py,
• and one executable, token_generator.
Additional details of these code files and the executable will be described later.
IMPORTANT NOTE: We have observed that students tend to erase or delete the /etc/passwd and /etc/shadow files while working on this project and lose the ability to log into the VM. It would be safe to take snapshots of your VM before starting the project and while progressing through it. We have also created copies of these files, namely /etc/passwd.cs6238 and /etc/shadow.cs6238, on the VM in case you delete, erase, or modify them. Since you have root access, it would be safe to keep your own copy and exercise caution while updating these files.
Prior to starting on the project, you should familiarize yourself with the working of the login command in Linux. In particular, you should be able to answer the following questions.
1. How are users created in Linux?
2. What algorithms are used in the process of encryption/hashing of passwords?
3. Where is information derived from passwords stored?
4. How does the system check if a correct password is provided by a login request?
5. Why and how is salt used?
6. Who has access to the file containing password-related information?
There are plenty of online resources for finding answers to these questions. To help you get started, see the section “GETTING STARTED ON LINUX LOGIN/PASSWORDS” in the Appendix.

1. Project Details
Getting Started: To help you get started, we have provided two Python code files to help you better understand the inner working of the system while creating and logging in users.
1. create_user.py: creates a new user. You should study its code to understand what it does and what requirements must be met to run it successfully. Check what changes are made to the /etc/shadow file after creating a user with create_user.py.
A good understanding of these changes will help you later in this project.
2. check_login.py: checks whether the user can log in using a pair of user-id and password values. By analyzing its code, you can learn how a user is validated after providing the correct password. In particular, explore how it retrieves the hash from the shadow file and checks it against the hash generated from the password provided by the user.
After understanding the code of create_user.py and check_login.py, you are ready to work on this project.

2. Task 1: Implementing 2FA (80% of grade):
In this task, you need to implement 2FA using the provided token generator (TG) executable, which serves as a second factor. The 2FA method uses the tokens generated by TG to harden the login mechanism used in Linux. Thus, each user has two accounts:
• One account on the 2FA system
• One account on TG
The Token Generator and the 2FA method are described first. Details regarding what you must implement are provided in Implementation of 2FA.
Token Generator (TG)
Before moving to the 2FA method, it is important to know how TG works. It gives a user three options – ‘1’ for registering a new user, ‘2’ for generating a token for the current/registered user, and ‘3’ for deleting an existing user account.
NOTE: You have been provided with the Token Generator (TG) executable. You only have to understand how it works so that you can use it as a black box in your project. You do not have to implement the Token Generator.
If the user enters:
‘1’ then
• TG will prompt the user for a user-id and a six-digit PIN. After this,
• It will generate an initial token, IT, and create an encrypted text file with the entered user-id in its name.
• Deleting this generated file is equivalent to deleting the user’s account from TG.
We will see how this initial token IT helps us when we discuss the 2FA method.
‘2’ then
• The user must provide the user-id and correct PIN for user U (think of this as unlocking the screen on your phone with a PIN or pattern when the phone is the 2FA device).
• If correct information is provided, the token generator will return the current token CT and the next token NT.
• These tokens will be used in the 2FA method.
‘3’ then
• The user must provide the same information as in option ‘2’ (user-id and correct PIN).
• TG will only produce the current token CT and will delete the user account and the associated file.
NOTE (Very Important!!!): After execution of each option in TG, the user will be prompted for confirmation of the requested task in the 2FA method. If the task in the 2FA method, for which tokens were generated using TG, completed successfully, the user must enter ‘y’ or ‘Y’. If the user enters some other character, TG will revert itself to the previously known state for the user.

The 2FA Method
1. Create a User Account: First, when user U tries to create an account, the 2FA login method requires the user to provide: username U, password P (with confirmation), salt, and the initial token IT generated by TG when registering a user account.
NOTE: The PIN for TG and the password for the 2FA system should be different, but the username for the 2FA system and the user-id for TG should be the same.
The 2FA login method takes this token IT, concatenates it with the provided password P, and this becomes the hardened password (P+IT) that goes into the password hashing algorithm to generate an entry in the shadow file. With this entry, the new user is successfully created. The whole process is visualized in Figure 2 [Appendix].
2. Logging into a Created User Account: After a user U is created, he/she can log into the account. For login, a user must provide username U, password P, the current token CT from the token generator, and the next token NT from the token generator.
The 2FA method first checks if user U exists. If yes, it concatenates the password P and current token CT to construct the hardened password (P+CT), as in user creation. This hardened password is validated against the hash value in the user’s entry in /etc/shadow. After successful validation, the user-entered password P is concatenated with the next token NT to create the new hardened password (P+NT). This new password is then hashed, and the new hash is used to update the corresponding field in /etc/shadow. Based on successful or failed execution of the above request, the user enters the response in TG, which decides whether the changes are saved or discarded. The full functionality of 2FA is visualized in Figure 3 [Appendix].
3. Update and Delete: 2FA should be able to update the user’s password. This deals with the situation where a user’s password is compromised, or a certain amount of time has elapsed since the password was created. For update, a user must provide username U, password P, new password NP (with confirmation), new salt NS, current token CT, and next token NT from the token generator. 2FA should also be able to delete a user’s account. For delete, the user must provide username U, password P, and current token CT. Update and delete functionality should be extrapolated from the login functionality.

Implementation of 2FA
IMPORTANT: Follow the prompts exactly as in the instructions below. We will not accept regrade requests based on incorrect prompt order.
After becoming familiar with the working of the 2FA method and TG, you must create a standalone program based on the functionality of the 2FA method. Please start from the Python code that we have provided. Your code must implement the prompts below in the exact order. Your program should be capable of handling the following steps:
1. Prompting the user for different requests (10 pts.): ‘1’ for creating a new user, ‘2’ for login, ‘3’ for updating the password, and ‘4’ for deleting a user account.
Select an action:
1) Create a user
2) Login
3) Update password
4) Delete user account
Also, prompting the user for appropriate inputs such as username, password, salt, and tokens is needed.
Prompt for action number 1 (create user):
Username: Alice
Password: Alice123
Confirm Password: Alice123
Salt: salt0123
Initial Token: eYKCaN0kLB7T0.3Q.vPs40
Prompt for action number 2:
Username: Alice
Password: Alice123
Current Token: eYKCaN0kLB7T0.3Q.vPs40
Next Token: iGxl329/ugOeSnhOzYE1B/
Prompt for action number 3:
Username: Alice
Password: Alice123
New Password: New-Password
Confirm New Password: New-Password
New Salt: salt3210
Current Token: eYKCaN0kLB7T0.3Q.vPs40
Next Token: iGxl329/ugOeSnhOzYE1B/
Prompt for action number 4:
Username: Alice
Password: Alice123
Current Token: Gxl329/ugOeSnhOzYE1B/
2. Creating Users (20 pts.): If ‘1’ for creating a user is selected from the prompt, the program should do the following:
a. Prompt for username, password, confirm password, salt, and initial token IT (in that order).
b. If the user already exists, the program should display “FAILURE: user already exists” and exit (this counts as a failed attempt at creating a user). The code should prompt for all information in step a before evaluating whether the user exists.
c. If not, create the user. When you create a user, you should update the /etc/shadow and /etc/passwd files for the user.
d. A home directory for that user should be created, and there should be an entry for the home directory in the passwd file.
e. If a user is created, your code should print “SUCCESS: created”.
NOTE: The salt will be the same for a user account unless the user updates the password or deletes the account and then creates it again.
3. Login (20 pts.): If a user enters “2” for login, the program should do the following:
a. Request the following information from the user:
Username: Alice
Password: Alice123
Current Token: eYKCaN0kLB7T0.3Q.vPs40
Next Token: iGxl329/ugOeSnhOzYE1B/
b. At this point, the complete login process described in the 2FA method should be executed.
c. On successful completion it should display “SUCCESS: Login Successful”.
d. If the user does not exist, it should display “FAILURE: user does not exist”. The code should prompt for username, password, current token, and next token before evaluating whether the user exists.
e. If the password or token is incorrect, it should display “FAILURE: either passwd or token incorrect.”
4. Password Update (15 pts.): If the user enters “3” for update, the program should do the following:
a. Ask the user for username, password, new password, confirm new password, new salt, current token, and next token in that order, and update the account.
b. On successful completion it should display “SUCCESS: user updated”.
c. Error handling should be done similarly to the login functionality above.
5. Deleting a user (15 pts.): If the user enters “4” for deleting an account, the program should:
a. Ask the user for username, password, and current token in that order.
b. If correct values are supplied for all of these, all entries and the home directory for the deleted user-id should be cleaned up.
c. On successful completion it should display “SUCCESS: user Deleted”.
d. Again, error handling must be done like the login functionality.
Please keep the following in mind as you work on the project.
1. After every successful request, the user should type ‘y’ or ‘Y’ when asked by TG. You should not enter ‘Y’ or ‘y’ before your script completes the task.
2. Remember, the interaction of 2FA with TG is manual, and tokens must be copy-pasted from TG into your 2FA implementation.
3. You should type the following commands in TG when invoking one of the four functions implemented by you.
If you are creating a new user, you must enter ‘1’ when prompted for user input by TG. Similarly, enter ‘2’ in case of updating or login, and ‘3’ in case of user deletion.

3. Task 2: Security Analysis of 2FA (20% of grade):
Complete a security analysis of the implemented 2FA method. More specifically:
• Discuss the advantages (any 2), disadvantages (any 2), and the possible attacks (any 2) on the above method.
• Suppose 2FA is to be implemented in a realistic environment. Recommend one improvement for the current 2FA scheme.
• How can one implement the 2FA scheme in a server-client setting, and how will you secure the token transfer between the separate systems?

4. Project Deliverables:
• Your Python script for the 2FA implementation. The file should be named ‘2FA.py’.
• A PDF report named ‘<username>_2FA.pdf’. For example, for me, it is ‘jrodriguez_2FA.pdf’. The report should contain:
o Your answers for Task 2 under a section “Task2”

5. Appendix
FIGURES
Figure 2: Creating an Account
Figure 3: Logging into an Account

GETTING STARTED ON LINUX LOGIN/PASSWORDS
When the Linux system creates a user, it prompts the user for a password. Then, based on the version of Linux, one of six algorithms is chosen for password hashing. The system generates a random salt, uses that salt to generate a one-way hash, and stores that hash with user details in the /etc/shadow file. The user entry looks like the example below:
As you can see, the user entry consists of 9 fields, each separated by “:”. The first field is the username and the second is the hash. The hash contains 3 further fields, separated by the dollar sign (“$”). The first tells us the hashing algorithm used; in this case, “6” denotes SHA-512. The second is the salt value used to make the hash unique. The last is the hash of the combination of your salt and password. You can easily verify the generation of the hash by using a Perl one-liner on your Ubuntu terminal:
perl -e ‘print crypt(“<password>”,”\$<id>\$<salt>\$”) . “\n”‘
Here, <password>=cs6238, <id>=6, <salt>=UPICuFgR
Note: Explore all the other fields, as you will need to know them for the project.
After storing the hash entry in /etc/shadow, the system creates a home directory and a user entry in the /etc/passwd file, which stores essential information required during login, i.e., user account information. This file contains one entry per line for each user. An entry in /etc/passwd for user cs6238 looks like:
Each entry in /etc/passwd has seven fields, each separated by “:”. The first field contains the username. The second field contains the password for the user; “x” denotes that the hashed password entry is in the shadow file. The next two entries are the UID and GID. The last two entries are the home directory of the user and the absolute path of the command shell. We are not going to discuss passwd file contents in detail, as for the project it is sufficient to know what is in the passwd file. However, you are welcome to further explore the details of the passwd file. After updating the entry in the passwd file, user creation completes.
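To make the shadow-file discussion concrete, the sketch below splits a shadow hash field into its three “$”-separated parts and shows the password+token concatenation idea from the 2FA method. It uses hashlib.sha512 purely as a stand-in; a real implementation must produce crypt(3)-compatible hashes (e.g., via Python’s crypt module, crypt.crypt(password + token, f"$6${salt}$")), exactly as check_login.py does:

```python
import hashlib

def split_hash_field(field):
    """Split a shadow hash field like '$6$UPICuFgR$AbC...' into
    (algorithm id, salt, hash)."""
    _, alg, salt, digest = field.split("$")
    return alg, salt, digest

def harden(password, token, salt):
    # The 2FA idea: hash the concatenation of password and token.
    # hashlib is a stand-in here; real code must use the crypt(3)
    # scheme so the result is a valid /etc/shadow entry.
    return hashlib.sha512((salt + password + token).encode()).hexdigest()

def check(password, token, salt, stored):
    # Login: recompute the hardened hash with the current token CT
    # and compare against the stored value.
    return harden(password, token, salt) == stored
```

On a successful login, the entry is then rewritten with harden(P, NT, salt), which is what forces the stored verifier to change after every login and limits offline guessing.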


[SOLVED] Cs6238 project 1

CS 6238: Secure Computer Systems – Understanding Memory Protection

Learning Objectives: The goal of this project is to help students become familiar with memory protection facilities provided by operating systems. In particular, you will learn how to limit access to a certain region of memory (e.g., only read, write, or execute access). The project has two parts. First, you will learn about mprotect(), a memory protection system call. In the second part, you will explore why it is a good idea to disable execution on the stack to safeguard a program’s execution. Memory protection is not limited to these two examples; there are many different methods to protect the stack and heap. You are encouraged to look at and familiarize yourself with other relevant memory protection mechanisms as well. However, to keep this project simple and of limited scope, we will focus only on the two mechanisms discussed above. Thus, the objectives of the project are as follows:
1. Understand how mprotect() works and how it can be implemented.
2. Understand the benefits of turning off the executable stack.

Project Setup:
Note: The link to the VM will be posted on an Ed Discussion pinned post.
For this project, you will be provided with an Ubuntu-based Virtual Machine (VM). This VM was tested on Oracle VM VirtualBox 7.0 and can be directly imported into it. The VM has a default account “cs6238” set up with normal user access privileges. You do not need root account credentials for completion of this project. The password for the “cs6238” account is “cs6238”. You should not include the quotes while entering the password. When you log into account “cs6238”, follow these instructions:
1. cd /home/cs6238/Desktop
This folder contains 2 additional folders named Stack Protection and M_protect.
We are going to use the contents of the M_protect folder for Task 1 and the contents of the Stack Protection folder for Task 2.

Background: This project assumes that you know the basics of the C programming language (or can learn it quickly). More specifically, you should be able to understand and modify a given C code snippet. Second, you should know how a basic buffer overflow works. For an understanding of basic buffer overflows, you can refer to course materials in CS 6035, Intro to Information Security. If not, try some Google-fu.

Project Details: For both tasks, you should first do some exploration and then conduct the specified experiments or answer questions based on what you learned. The project should be straightforward and is focused more on exploring things and how they work.

Task 1: Protection of Memory via mprotect() (50% of grade):
This task is divided into two parts: (A) understanding mprotect() and (B) experimenting with programs that use this call.
Part (A) Understanding Memory Protection (20% of grade):
For this task, navigate to the M_protect directory in Desktop/Project1 via the terminal and compile both mprotect.c and without_mprotect.c with the gcc compiler [see https://www.wikihow.com/Compile-a-C-Program-Using-the-GNU-Compiler-(GCC)]. (NOTE: Ignoring warnings is not usually recommended; if there are warnings in your C program, the best practice is to fix them. But for the sake of this project, carry on.) Run both object files you generated and see the difference between their outputs. Do not worry if the addresses differ between the outputs. Look for SIGSEGV when executing the object generated by compiling the mprotect.c program. You can refer to the man page of mprotect (http://man7.org/linux/man-pages/man2/mprotect.2.html) to learn more about this call.
To answer:
1. Why do you get SIGSEGV when you execute the object code generated by compiling the mprotect.c program?
2. How does mprotect() protect memory?
3. What is the minimum size requirement for this call?
4. What happens if you pass the value of the “len” argument as 1?
5. Sam is a programmer, and he needs to protect a 0x380-byte block starting at memory address 0x1234000 and ending at address 0x1234379. This block should be protected from overwriting by some other internal function. Can Sam use the mprotect() function in his C program to protect this memory space? Explain your answer.
6. Review and report what data is being protected in mprotect.c by the mprotect() function. Moreover, discuss whether any function can perform a read or write on the protected data. If the mprotect call is used in multiple instances, write your observations for each of them.
Part (B) Experimentation and Implementation (30% of grade):
NOTE: Include a screenshot of all 5 parts in your report.
1. Write the first n bytes of the last two pages (9th and 10th pages) with your first name, where n is equal to the number of characters in your first name (print to STDOUT).
2. Now, use mprotect() to allow read and write access on the 7th and 8th pages. Write your last name in the first n bytes of the 7th and 8th pages, where n is equal to the number of characters in your last name, and then try to read it. You should display it on the output screen (print to STDOUT).
4. Now, create a buffer of 2 pages and try to copy the 7th and 8th pages into it. Can you copy them? If not, why?
5. Now try to copy the 6th page and 9th page into the previously created buffer. Are you able to do so? If not, when your code hits SIGSEGV, is it copying the 6th or 9th page? Explain your answer.
All the above steps should be followed in the same order, and updates should be made to the same script. Include in the report the screenshots of the print to STDOUT for all tasks 1–5.
Deliverables for Task 1:
1. Completed code (Exercise.c).
2. Report.pdf – Report.pdf should contain a section called Task 1 with answers to the above questions for Part A and Part B.
Note: Do not zip up the deliverables.
Upload them separately.

Task 2: Non-Executable Stack (50% of grade):
This task is also divided into two parts: (A) understanding the importance of a non-executable stack, and (B) experimenting with vulnerable code with protected and unprotected stacks.
Part (A) Understanding Stack Protection (30% of grade):
Review various mechanisms that are used to protect against buffer overflow exploits. More specifically, research the methods given below and write a brief explanation for each one of them. Each answer must be no more than a paragraph, and definitely no more than half a page of text (diagrams do not count toward this limit).
1. Stack smashing via buffer overflow.
2. Stack canary.
3. NX (Non-Executable Stack).
4. Address space layout randomization (ASLR).
Part (B) Experiments with execution of code (20% of grade):
For this part of Task 2, locate the Stack Protection directory via the terminal, where you will find vuln.c. You should review it carefully and then compile this source code into four different binaries with the following gcc options:
• gcc -g -O0 -fno-stack-protector -z execstack -o vuln-nossp-exec vuln.c
• gcc -g -O0 -fno-stack-protector -o vuln-nossp-noexec vuln.c
• gcc -g -O0 -z execstack -o vuln-ssp-exec vuln.c
• gcc -g -O0 -o vuln-ssp-noexec vuln.c
To answer: (Each answer should be no more than a paragraph.)
1. Explain the functionality of the -fno-stack-protector and -z execstack options of gcc. Explain the differences you see in the binaries when they are created with and without these options.
2. Which of the above four binaries can you exploit by smashing the stack and overflowing a buffer?
3. Attempt to find the stack canary value in the vuln-ssp-exec binary using a debugger like gdb. Does the canary value change or remain the same across multiple compilations/executions? Post screenshots of your attempts.
Deliverables: Just a single “Report.pdf” for both Task 1 and Task 2.
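For intuition about the page-granular behavior that Task 1 explores, mprotect() can also be poked at from Python via ctypes. This is only an illustration (the project itself is in C) and it assumes a POSIX system where libc exposes mprotect and anonymous mmap regions are page-aligned:

```python
import ctypes
import mmap

libc = ctypes.CDLL(None, use_errno=True)
PAGE = mmap.PAGESIZE

# Reserve ten anonymous, page-aligned pages.
buf = mmap.mmap(-1, 10 * PAGE)
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))

# Drop all permissions on the 7th and 8th pages (zero-based offsets 6 and 7).
PROT_NONE = 0  # not exposed by the mmap module
assert libc.mprotect(ctypes.c_void_p(addr + 6 * PAGE),
                     ctypes.c_size_t(2 * PAGE), PROT_NONE) == 0
# Touching buf[6 * PAGE] here would raise SIGSEGV, as in Part (B).

# Restore read/write so the pages are usable again.
assert libc.mprotect(ctypes.c_void_p(addr + 6 * PAGE),
                     ctypes.c_size_t(2 * PAGE),
                     mmap.PROT_READ | mmap.PROT_WRITE) == 0
buf[6 * PAGE:6 * PAGE + 5] = b"Hello"
print(bytes(buf[6 * PAGE:6 * PAGE + 5]))  # prints b'Hello'
```

Note that both the address and the length are rounded to page boundaries by the kernel, which is the crux of questions 3–5 in Part (A).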


[SOLVED] Cs6210 project 3- big picture

In this project, you are going to implement major chunks of a simple distributed service using gRPC. Learnings from this project will also help you in the next project, as you will become familiar with gRPC and multithreading with a thread pool.

Overview
You are going to build a store (think of the Amazon Store!), which receives requests from different users querying the prices offered by the different registered vendors. Your store will be provided with a file of vendor server addresses. On each product query, your store is supposed to ask all of these vendor servers for their bid on the queried product. Once your store has responses from all the vendors, it should collate the (bid, vendor_id) pairs from the vendors and send them back to the requesting client.

Learning outcomes
– Synchronous and asynchronous RPC packages
– Building a multi-threaded store in a distributed service

Environment Setup
To set up your environment, you can choose one of the following methods:
– Option 1: Follow this link for a cmake-based setup on your host machine
– Option 2: Using Docker
Option 2: Setting Up the Docker Environment (skip the whole Option 2 section if you choose Option 1)
If you prefer to use a Docker environment for Project 3, you can either use the pre-built Docker image or build and run your own image.
Option 2.1: Pre-built Docker image
docker pull dcchico/aos_project3
docker run -it dcchico/aos_project3
Option 2.2: Build and run your own image with the following Dockerfile (skip this step if you use Option 2.1, the pre-built Docker image)
Copy the code below into a file and name it “Dockerfile”.
FROM ubuntu:22.04
ENV MY_INSTALL_DIR /.local
ENV PATH $MY_INSTALL_DIR/bin:$PATH
RUN apt update && apt install -y cmake build-essential autoconf libtool pkg-config git zip unzip && git clone --recurse-submodules -b v1.58.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc /grpc && mkdir -p $MY_INSTALL_DIR
WORKDIR /grpc/cmake/build
RUN cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF -DCMAKE_INSTALL_PREFIX=$MY_INSTALL_DIR ../.. && make -j 4 && make install
WORKDIR /project3
COPY ./project3-template /project3
CMD /bin/bash

Building and Running the Docker Image
Build the Docker image: docker build -t project3-docker .
Run the Docker container: docker run -it project3-docker
Troubleshooting: Undefined Reference to grpc::Status::OK
If you see errors like "undefined reference to grpc::Status::OK" while running make, add gRPC::grpc++_reflection to the linked-libraries list for run_tests in /tests/CMakeLists.txt. Lines 10 to 17 should look like this:
add_executable(run_tests client.cc run_tests.cc product_queries_util.h)
target_link_libraries(run_tests Threads::Threads gRPC::grpc++ gRPC::grpc++_reflection p3protolib)
add_dependencies(run_tests p3protolib)

How You Are Going to Implement It (Step by Step)
1. Make sure you understand how gRPC synchronous and asynchronous calls work. Understand the given helloworld example. You will be building your store with asynchronous mechanisms ONLY.
2. Establish asynchronous gRPC communication between:
– your store and the user client,
– your store and the vendors.
3. Create your thread pool and use it. Where will you use it, and for what? Upon receiving a client request, your store will assign a thread from the thread pool to the incoming request for processing.
– The thread will make async RPC calls to the vendors.
– The thread will wait for all results to come back.
– The thread will collate the results.
– The thread will reply to the store client with the results of the call.
– Having completed the work, the thread will return to the thread pool.
4. Do you have your user client request reaching the vendors now? And can you see the bids from the different vendors at your user client end? Congratulations, you almost got it! Now use the test harness to test whether your server can serve multiple clients concurrently, and make sure that your thread handling is correct.
5. Remember to add references to all the resources you used while working on the project.

Keep In Mind
– Your server has to handle multiple concurrent requests from clients.
– It must be stateless so far as client requests are concerned (once a client request is serviced, it can forget the client).
– It must manage the connections to the client requests and the requests it makes to the 3rd-party vendors.
– The server will get the vendor addresses from a file with line-separated strings.
– Your server should accept command-line input of the vendor addresses file, the address on which it is going to expose its service, and the maximum number of threads its thread pool should have. The format of the invocation is: ./store

Given to You
1. run_tests.cc – Simulates real-world users sending concurrent product queries. This will be released to you soon.
2. client.cc – Provides the ability to connect to the store as a user.
3. vendor.cc – Acts as the server providing bids for different products. Multiple instances of it will be run, listening on different IP addresses and ports.
4. Two .proto files:
– store.proto – communication protocol between user (client) and store (server)
– vendor.proto – communication protocol between store (client) and vendor (server)

How to run the test setup
Go to the project3 directory and build the program.
Three binaries will be created in the bin folder – store, run_tests, and run_vendors. (Note that the location of the bin folder depends on how you build the program.)
First, run the command ./run_vendors vendor_addresses.txt & to start a process which will run multiple servers on different threads, listening on the (ip_address:port) pairs from the file given as a command line argument.
Then start up your store, which will read the same address file to learn the vendors' listening addresses. Your store should also start listening on a port (for clients to connect to) given as a command line argument.
Finally, run the command ./run_tests $IP_and_port_on_which_store_is_listening $max_num_concurrent_client_requests to start a process which will simulate real-world clients sending requests at the same time. This process reads the queries from the file product_query_list.txt. It will send some queries and print back the results, which you can use to verify your whole system's flow.
Grading
This project is not performance oriented; we will only test functionality and correctness. Total possible score: 12. Below is the rubric:

| Score | Reason |
| ----- | ------ |
| +2 | Code compiles |
| +4 | Query output is correct |
| +3 | Threadpool management |
| +1 | store-server operates in async fashion |
| +1 | store-client operates in async fashion |
| +1 | Readme |

Deliverables
Submission Directory Structure:
Readme.txt
src/CMakeLists.txt
src/store.cc
src/threadpool.h
src/any_additional_supporting_files.*
You must use collect_submission.py to create the submission zip file. Submit the zip file in Gradescope. You can verify your submission using the autograder in Gradescope.
FAQ
FAQ can be found here.


[SOLVED] Cs6210 project 2- barrier synchronization algorithms

Overview OpenMP allows you to run parallel algorithms on shared-memory multiprocessor/multicore machines. For this assignment you will implement two spin barriers using OpenMP. MPI allows you to run parallel algorithms on distributed memory systems, such as compute clusters or other distributed systems. You will implement two spin barriers using MPI. Finally, you will choose one of your OpenMP barrier implementations and one of your MPI barrier implementations and combine the two in an MPI-OpenMP combined program in order to synchronize between multiple cluster nodes that are each running multiple threads. You will run experiments to evaluate the performance of your barrier implementations (information about compute resources for running experiments is in a later section). You will run your OpenMP barriers on an 8-way SMP (symmetric multi-processor) system, and your MPI and MPI-OpenMP combined experiments on a cluster of up to 24 nodes with 12 cores each. Finally, you will create a write-up that explains what you did, presents your experimental results, and most importantly, analyzes your results to explain the trends and phenomena you see (some hints for analysis are given below). Detailed Instructions Part 1: Learn about OpenMP and MPI The first thing you want to do is learn how to program, compile, and run OpenMP and MPI programs. Setup Included with this project is a Vagrantfile which provides a VM with the required environment and installs. Assuming you don’t have it installed already, download the latest version of Vagrant for your platform. You’ll also need the latest version of VirtualBox, which can be found here. OpenMP You can compile and run OpenMP programs on any Linux machine that has libomp installed. You can try the example code in the assignment folder (examples/OpenMP). 
Additional informational resources are as follows: – OpenMP Website – OpenMP Specification – Introduction to OpenMP (video series) – LLNL's OpenMP Tutorial
MPI
You can compile and run MPI programs on any Linux machine that has mpich installed (e.g., the Vagrant box). Although MPI is normally used for performing computations across different network-connected machines, it will also run on a single machine. This setup can be used for developing and testing your project locally. You can try running the example code in the assignment folder (examples/MPI), as well as looking at the following informational resources: – MPI website – MPICH website – OpenMPI website (for general MPI API documentation)
Part 2: Develop OpenMP Barriers
Given to you: 1. The barrier function interfaces are specified in omp/gtmp.h in the assignment folder. Don't change the function signatures. 2. The omp/harness.c is a rudimentary test harness for you to test your implementation. Feel free to modify the harness to your needs.
What you need to do: Complete the implementations of your 2 barriers in omp/gtmp1.c and omp/gtmp2.c
Part 3: Develop MPI Barriers
You can optionally use MPI's built-in barrier as a third barrier in your experiments, as a baseline/control, if you choose.
Given to you: 1. The barrier function interfaces are specified in mpi/gtmpi.h in the assignment folder. Don't change the function signature. 2. The mpi/harness.c is a rudimentary test harness for you to test your implementation. Feel free to modify the harness to your needs.
What you need to do: Complete the implementations of your 2 barriers in mpi/gtmpi1.c and mpi/gtmpi2.c
Part 4: Develop MPI-OpenMP Combined Barrier
Now choose one of the OpenMP barriers you implemented and one of the MPI barriers you implemented. Combine them to create a barrier that synchronizes between multiple nodes that are each running multiple threads.
You'll also want to be sure to preserve your original code for the two barriers so that you can still run experiments on them separately. You can compare the performance of the combined barrier to your standalone MPI barrier. Note that you will need to run more than one MPI process per node in the standalone configuration to make it comparable to one multithreaded MPI process per node in the combined configuration, so that the total number of threads is the same when you compare.
Given to you: You are given a template combined/Makefile which generates a binary named combined to test the combined barrier.
What you need to do: Implement the combined barrier along with your own testing harness to generate a binary named "combined". Please provide the appropriate Makefile. The invocation for the binary will be as follows: mpiexec.mpich -np ./combined
Note that you are free to create your own harness for the combined barrier. The Gradescope autograder will only test for compilation and a run of the combined barrier.
Part 5: Run Experiments
1. The next step is to do a performance evaluation of your barriers on a large cluster (PACE). Information on how to use the cluster is described under Resources.
3. You will measure your OpenMP barriers on a single cluster node, and scale the number of threads from 2 to 8.
4. You will measure your MPI barriers on multiple cluster nodes. You should scale from 2 to 12 MPI processes, one process per node.
5. You will measure your MPI-OpenMP combined barrier on multiple cluster nodes, scaling from 2 to 8 MPI processes running 2 to 12 OpenMP threads per process.
Some things to think about in your experiments:
2. You can use the gettimeofday() function to take timing measurements. See the man page for details about how to use it. You can also use some other method if you prefer, but explain in your write-up which measurement tool you used and why you chose it.
Consider things like the accuracy of the measurement and the precision of the value returned.
3. If you're trying to measure an operation that completes too fast for your measurement tool (i.e., if your tool is not precise enough), you can run that operation several times in a loop, measure the time to run the entire loop, and then divide by the number of iterations in the loop. This gives the average time for a single loop iteration. Think a moment about why that works, and how that increases the precision of your measurement.
4. Finally, once you've chosen a measurement tool, think a bit about how you will take that measurement. You want to be sure you measure the right things, and exclude the wrong things from the measurement. You also want to do something to account for variation in the results (so, for example, you probably don't want to just measure once, but measure several times and take the average).
Part 6: Write-Up
The last part is to create the write-up. This should be a PDF file, and it should include at a minimum the following:
1. The names of both team members
2. An introduction that provides an overview of what you did (do not assume the reader has already read this assignment description).
3. An explanation of how the work was divided between the team members (i.e., who did what)
4. A description of the barrier algorithms that you implemented. You do not need to go into as much implementation detail (with pseudocode and so forth) as the MCS paper did. However, you should include a good high-level description of each algorithm. You should not simply say that you implement algorithm X from the paper and refer the reader to the MCS paper for details.
5. An explanation of the experiments, including what experiments you ran, your experimental set-up, and your experimental methodology. Give thorough details. Do not assume the reader has already read this assignment description.
6. Your experimental results. DO present your data using graphs.
DO NOT use tables of numbers when a graph would be better (hint: a graph is usually better). DO NOT include all your raw data in the write-up. Compare both your OpenMP barriers. Compare both your MPI barriers. Present the results for your MPI-OpenMP barrier.
7. An analysis of your experimental results. You should explain why you got the results that you did (think about the algorithm details and the architecture of the machine on which you experimented). Explain any trends or interesting phenomena. If you see anything in your results that you did not expect, explain what you did expect to see and why your actual results are different. There should be at least a couple of interesting points per experiment. The key is not to explain only the what of your results, but the how and why as well.
8. A conclusion.
Resources
1. You will have access to the coc-ice PACE cluster for use with this project.
3. Please refer to the Cluster-HOWTO for details on using the PACE cluster.
Submission Instructions
Submit the following to the Project 2 module in Gradescope:
1. project2/
     omp/
       Makefile
       gtmp.h
       gtmp1.c
       gtmp2.c
       harness.c
     mpi/
       Makefile
       gtmpi.h
       gtmpi1.c
       gtmpi2.c
       harness.c
     combined/
       Makefile (generates the "combined" binary)
       *.c (all required sources)
       *.h (all required headers)
2. Report.pdf – Your write-up (as a single PDF file) that includes all the things listed above will be submitted to the same Gradescope module.


[SOLVED] Cs6210 project 1- vm cpu scheduler and memory coordinator

README.md. Project Overview During one interval, the vCPU scheduler should track each guest machine's vCPU utilization and decide how to pin the vCPUs to pCPUs, so that all pCPUs are "balanced", i.e., every pCPU handles a similar amount of work. The "pin changes" can incur overhead, but the vCPU scheduler should try its best to minimize it. Similarly, during one interval, the memory coordinator should track each guest machine's memory utilization and decide how much extra free memory should be given to each guest machine. The memory coordinator should set the memory size of each guest machine and trigger the balloon driver to inflate and deflate. The memory coordinator should react properly when the memory resource is insufficient.
Tools that you will need: qemu-kvm, libvirt-bin, and libvirt-dev are packages you need to install so that you can launch virtual machines with KVM and develop programs to manage virtual machines. libvirt is a toolkit providing lots of APIs to interact with the virtualization capabilities of Linux. Virtualization is a page you should check.
Environment Setup
1. You have two main options for setting up your development environment:
   – Cloud VM on Azure: ideal for both development and testing.
   – Preconfigured local environment: use the provided setup with Vagrant and VirtualBox.
2. If you're on Windows (Hyper-V enabled) or macOS, it's recommended to go with the Azure cloud VM.
3. Refer to EnvironmentSetup.md (section: Setting Up Your Environment) for step-by-step instructions on:
   – Configuring an Azure cloud VM.
   – Setting up your environment with Vagrant and VirtualBox.
4. If you choose Vagrant or a local setup, ensure your system meets these requirements:
   – At least 6 GB of RAM.
   – 4 physical CPU cores.
   – Default SSH login for the Vagrant VM: Username: vagrant, Password: vagrant.
5. The EnvironmentSetup.md file also provides instructions for creating VMs on a KVM hypervisor (section: Creating Test VMs).
⚠️ Important: If you decide to configure your environment manually, make sure: – Your machine or VM settings match those specified in the Vagrantfile. – All required packages are installed.
Where can I find the APIs I might need to use?
1. libvirt-domain provides APIs to monitor and manage the guest virtual machines.
2. libvirt-host provides APIs to query information regarding the host machine.
Directory layout
This directory contains boilerplate code, a testing framework, and example applications for evaluating the functionality of your CPU Scheduler and Memory Coordinator. The boilerplate code is provided in the /cpu/src/ and /memory/src/ folders. Details for testing the CPU Scheduler can be found in the cpu/test/ folder, and details for testing the Memory Coordinator can be found in the memory/test/ folder.
Project Flow
Refer to the flowchart below to help understand the overall project concept. We are taking the CPU scheduler as an example here.
VCPU Scheduler Tasks
1. Complete the function CPUScheduler() in vcpu_scheduler.c.
2. If you are adding extra files, make sure to modify the Makefile accordingly.
3. Compile the code using the command make all.
4. You can run the code by ./vcpu_scheduler . For example, ./vcpu_scheduler 2 will run the scheduler with an interval of 2 seconds.
5. While submitting, write your algorithm and logic in the readme cpu/src/Readme.md.
Step-by-Step Guide
1. Connect to the Hypervisor: Use the virConnect* functions in libvirt-host to establish a connection. For this project, connect to the local hypervisor at qemu:///system.
2. List Active Virtual Machines: Retrieve all actively running virtual machines within qemu:///system using the virConnectList* functions.
3. Collect VCPU Statistics: Use the virDomainGet* functions from libvirt-domain to gather VCPU statistics. If host PCPU (physical CPU) information is also required, use the relevant APIs in libvirt-host.
4. Handle VCPU Time Data: VCPU time is typically provided in nanoseconds, not as a percentage. Transform this data into a usable format or incorporate it directly into your calculations.
5. Determine VCPU to PCPU Mapping: Use the virDomainGet* functions to identify the current mapping (affinity) between VCPUs and PCPUs.
6. Develop Your Algorithm: Based on the collected statistics, design an algorithm to find the "best" PCPU for each VCPU. Optimize for efficient CPU usage while ensuring no PCPU is over- or under-utilized.
7. Update VCPU Pinning: Use the virDomainPinVcpu function to dynamically assign each VCPU to its optimal PCPU.
8. Create a Periodic Scheduler: Start with a "one-time scheduler" to establish a baseline, then revise it to run periodically for ongoing optimization.
9. Test Your Scheduler: Launch several virtual machines and simulate workloads to consume CPU resources. Evaluate the scheduler's performance by observing how well it balances and stabilizes CPU usage across PCPUs.
Key Considerations
Algorithm Requirements
– The algorithm must be independent of the number of VCPUs and PCPUs.
– It should handle all configurations, including:
  – #VCPUs > #PCPUs: more virtual CPUs than physical CPUs.
  – #VCPUs = #PCPUs: an equal number of virtual and physical CPUs.
  – #VCPUs < #PCPUs: fewer virtual CPUs than physical CPUs.
– A generic approach that focuses on stabilizing processor usage is sufficient and will naturally handle these cases without requiring specific logic for each scenario.
– The test cases provided operate under the assumption of 8 VCPUs and 4 PCPUs, but they can be extended to a different count of PCPUs. For example, for an 8-core system (8 PCPUs), you should be able to evaluate your algorithm for a setup of 16 VMs (16 VCPUs) with similar expectations.
Memory Coordinator Tasks
1. Complete the function MemoryScheduler() in memory_coordinator.c.
2. If you are adding extra files, make sure to modify the Makefile accordingly.
3.
Compile the code using the command make all.
4. You can run the code by ./memory_coordinator . For example, ./memory_coordinator 2 will run the coordinator with an interval of 2 seconds.
5. While submitting, write your algorithm and logic in the readme memory/src/Readme.md.
Step-by-Step Guide
1. Connect to the Hypervisor: Use the virConnect* functions in libvirt-host to establish a connection. For this project, connect to the local hypervisor at qemu:///system.
2. List Active Virtual Machines: Retrieve all active virtual machines within qemu:///system using the virConnectList* functions.
3. Enable Memory Statistics Collection: Use the virDomainSetMemoryStatsPeriod function to configure memory statistics collection.
4. Retrieve Memory Statistics: Decide which memory statistics are relevant for your use case, then use the virDomainGet* and virDomainMemory* functions to fetch the required data.
5. Fetch Host Memory Information: Use the virNodeGet* functions in libvirt-host to gather host memory details.
6. Design Your Algorithm: Develop a policy to allocate extra free memory to each virtual machine based on the collected statistics. Decide how much memory should be reserved and how much can be dynamically allocated.
7. Update Memory Allocation: Use the virDomainSetMemory function to dynamically adjust the memory for each virtual machine. This triggers the balloon driver.
8. Create a Periodic Memory Scheduler: Start with a "one-time scheduler" and revise it to run periodically for continuous optimization.
9. Test the Memory Coordinator: Launch several virtual machines and simulate memory usage by running test workloads. Gradually consume memory resources and evaluate the performance of your memory scheduler.
Key Considerations
Algorithm Requirements
1. Ensure that both the VMs and the host retain sufficient memory after releasing any memory.
2.
Release memory gradually: for example, if a VM has 300 MB of memory, do not release 200 MB in a single step.
3. Maintain a minimum of 100 MB of unused memory for each VM.
4. The host should not release memory if it has less than or equal to 200 MB of unused memory.
Recording Test Results with script
To validate your scheduler or coordinator, use the script command to record terminal sessions and store the results in a log file. This ensures a detailed record of test outcomes. Follow these steps:
Steps to Record Test Results
1. Start Recording: Run the script command to start recording terminal output: script vcpu_scheduler1.log or script memory_coordinator1.log
2. Run the Monitor: Execute the monitor test using the monitor.py script: python3 monitor.py -t runtest1.py (replace runtest1.py with the appropriate test case). Run 3 test cases each for CPU and memory.
3. Run Your Scheduler or Coordinator: Launch your program in a separate terminal to perform scheduling or memory coordination.
4. Stop Recording: Exit the script command by typing exit, then verify that the log file has been generated and contains the expected results.
5. Repeat for All Test Cases: Ensure you generate separate log files for each test case. For accurate results, reboot your VMs before running each test.
Additional Information
– The log file will be saved in the current working directory upon exiting the script command.
– For more details about script, refer to its manual by running: man script
Testing Process
1. Testing the CPU Scheduler: Follow the instructions provided in ./cpu/test/HowToDoTest.md. There are 3 test cases for the CPU scheduler. Detailed scenarios and expected outcomes for each test case are available in ./cpu/test/HowToDoTest.md.
2. Testing the Memory Coordinator: Follow the instructions provided in ./memory/test/HowToDoTest.md. There are 3 test cases for the memory coordinator.
Detailed scenarios and expected outcomes for each test case are available in ./memory/test/HowToDoTest.md.
Note: In the autograder environment:
– Up to 4 VMs will be used for memory coordinator tests.
– Up to 8 VMs will be used for VCPU scheduler tests.
– Each VM will be configured with 1 VCPU.
Grading
This is not a performance-oriented project; we will test the functionality only. Please refer to the sample output pdf to understand the expected behavior from the scheduler and coordinator across test cases on the autograder. More details can be found in the test directories described in the testing section. The rubric will be:
1. vCPU scheduler functionality – 6 points
– The scheduler should aim to make the pCPUs enter a stable and balanced state.
– 1 point for the Readme.
– 1 point for the first 3 test cases, whose implementation is available.
– 4 points for the remaining 4 test cases.
2. Memory coordinator functionality – 6 points
– The VMs should consume or release memory appropriately for each test case.
– Don't kill the guest operating system (do not take all the memory resources from guests).
– Don't freeze the host (do not give all available memory resources to guests).
– 5 points for implementation (1.25 per test case) and 1 point for the Readme.
Deliverables & Submission
You need to implement two separate C programs, one for the vCPU scheduler (/cpu/src/vcpu_scheduler.c) and another for the memory coordinator (/memory/src/memory_coordinator.c). Both programs should accept one input parameter: the time interval (in seconds) at which your scheduler or coordinator will trigger. For example, if we want the vCPU scheduler to take action every 2 seconds, we will start your program by doing ./vcpu_scheduler 2. Note that the boilerplate code is provided in the attached zip file. You need to submit one zipped file named FirstName_LastName_p1.zip (e.g.
George_Burdell_p1.zip) containing a subfolder named FirstName_LastName_p1 and two separate subfolders (cpu and memory) within the FirstName_LastName_p1 subfolder, each containing a Makefile, a Readme.md (containing the code description and algorithm), source code, and the log files generated through the script command for each test case. Use the script collect_submission.py to generate the zip file. We will compile your program by just doing make. Therefore, your final submission should be structured as follows after being unzipped. Don't change the names of the files. Please adhere to the submission instructions; not doing so will result in a penalty of points.
FirstName_LastName_p1/
  cpu/
    vcpu_scheduler.c
    Makefile
    Readme.md (code description and algorithm)
    3 vcpu_scheduler.log files — vcpu_scheduler1.log and so on, one per test case
  memory/
    memory_coordinator.c
    Makefile
    Readme.md (code description and algorithm)
    3 memory_coordinator.log files — memory_coordinator1.log and so on, one per test case
To generate the final zip file, ensure that all the required files are present and run the following command: python3 collect_submission.py
Once you've successfully created the zip folder as per the instructions, you must upload it on Gradescope. Keep in mind that each submission will take around 40 minutes to autograde during off-peak hours, so be advised to submit early!
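Independent of the libvirt plumbing, the balancing policy from the "Develop Your Algorithm" step can be sketched on plain numbers. This hypothetical `BalancePins` function (our own name, not part of the boilerplate) takes per-vCPU utilization percentages, already derived from the nanosecond deltas, and greedily assigns each vCPU to the currently least-loaded pCPU; the real scheduler would then apply the resulting mapping with virDomainPinVcpu.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Greedy bin-packing: sort vCPUs by utilization (descending) and pin each
// to the pCPU with the smallest accumulated load. Returns pin[i] = pCPU index.
std::vector<int> BalancePins(const std::vector<double>& vcpu_util, int num_pcpus) {
    std::vector<int> order(vcpu_util.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](int a, int b) { return vcpu_util[a] > vcpu_util[b]; });

    std::vector<double> load(num_pcpus, 0.0);
    std::vector<int> pin(vcpu_util.size(), 0);
    for (int v : order) {
        int best = std::min_element(load.begin(), load.end()) - load.begin();
        pin[v] = best;
        load[best] += vcpu_util[v];
    }
    return pin;
}
```

To honor the "minimize pin changes" requirement, a real implementation would additionally keep the current mapping untouched whenever the load spread across pCPUs is already within some tolerance, rather than recomputing pins every interval.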


[SOLVED] Comp90041 assignment 2

Welcome to Assignment 2 for COMP90041 – Programming and Software Development! In this assignment, we will extend the Calendar Console to add new features such as:
• Calendar Views – Monthly, Fortnightly, Weekly
• Different kinds of events – Timed Events, All-day Events
• Loading the calendar events from a file
• Saving the modified calendar events into a file
• Managing exceptions while reading the file.
Preamble: "The Specifications"
The slides from this lesson module for the assignment act as "the specifications" for the system we are about to build. "The specifications" in real-life software development is the document that details the features that should be implemented in any particular project. They are the features and requirements that you, the Software Developer, and the client have agreed should be implemented in the system. As such, you should read these specifications carefully and ensure that your program implements the requirements of the specification correctly. Tests will be run on your program to check that you have implemented these specifications correctly. Note that for Assignment 2, we will provide 10 visible tests and 6 hidden tests that you can run to check that the basic functionality of your system is correct.
Tip: Look at the assets file present. Think about what the hidden test cases might cover by looking at the data, and perform your own testing. So once your program passes the basic tests, we strongly recommend that you perform further tests yourself, to ensure that the required features have been implemented correctly.
How to read the specifications?
• We will show some of the scenarios as valid/expected/best-case scenarios. This section will be marked in green. This means that for these sections, we will not provide any incorrect inputs or ill-formatted inputs. Best Case Scenario: This section describes a real-life intended scenario.
• Some of the specifications will be referred back from Assignment 1.
This will be notified using a green strip as shown below – Carried Forward: This section is referred as is from Assignment 1.
• Some inputs to the program can be invalid and should be handled accordingly by the program. These unexpected but valid scenarios will be tested and are shown below.
• Other specifications can be referred from Assignment 1 with slight modifications. This will be notified using a yellow strip. Modifications: This section is referred from Assignment 1 with slight modifications.
• New additions to the assignment will be marked using blue callouts like below – Addition: This specification is entirely new to Assignment 2.
• Out-of-scope scenarios are embedded as a red strip in the specifications. Out of Scope Scenario: This scenario …….. is out of scope for this assignment.
• Also, note that the code snippets have some bold characters that represent the inputs to the program.
Preamble: Intended Learning Outcomes
The Intended Learning Outcomes for the final Project are mentioned below –
• Control Flows – use branching and looping to navigate the special use cases in the specifications.
• Classes – identify the entities and encapsulate their data and actions together in a class.
• Arrays – use 1D and 2D arrays and perform associated operations.
• Packages – identify and group logical entities (classes) together.
• Javadoc – generate Javadocs from comments used in classes.
• UML – generate a UML diagram for the classes.
• Inheritance – implement polymorphism correctly.
• Interfaces – implement some operations via interfaces.
• File Handling – read and write data from/to different files. Don't use java.nio.Files methods.
• Exception Handling – handle exceptions gracefully and appropriately when incorrect usage occurs.
• Generics – Generics are optional for this assignment. Implementing Generics is not the same as using Generics like ArrayList.
Preamble: Structure and Style
We will also be assessing your code for good structure and style.
Use methods when appropriate to simplify your code, and avoid duplicate code. Use correct Java naming conventions for class names, variable names, and method names, and organise them well in your code so that it is readable. Look at the conventions provided by Oracle here. The code structure in a file is very important and improves readability. Look at the code organisation conventions provided by Oracle here. Make sure the names you choose are meaningful, to improve the readability of your code. Ensure you add meaningful comments within your code and Javadoc comments for classes and methods. We will provide a marking scheme to help guide you in the development of your program. We will also guide you on how to model your program. Correct syntax and conventions can be learnt here.
Calendar Entities
The entities of the Calendar Console have changed. Here is some information about the data the Calendar Console can process.
• Monthly – This view is inherited from Assignment 1. It has 6 rows for weeks.
• Fortnightly – This view is new for Assignment 2. It shows 2 weeks.
• Weekly – This view shows only 1 week.
Each event in a calendar entry has a description. Events can be either:
• a timed event – A timed event can be of type MEETING or REMINDER. Timed events also have start and end times in the format HH:mm.
• an all-day event – An all-day event does not have any start or end time but runs all day. These could be of type BIRTHDAY, ANNIVERSARY, OUT_OF_OFFICE, or PUBLIC_HOLIDAY, which are also considered non-working days.
Assumption: Though Sat/Sun are non-working days, they are not tested in this Assignment 2. You can ignore handling any special scenarios for these days in your assignment.
A timed event on a non-working day has a special criterion. If someone tries to set up a meeting that conflicts with an existing scheduled meeting, or a meeting is set up on a non-working day, the program will send an email to reschedule the event.
Pro Tip: Sending an email here simply uses print statements. No overly complicated coding is required for the assignment.
File Handling & Command Line Args
Command Line Arguments
• Calendar view type to show the number of rows. These values could be MONTHLY, FORTNIGHTLY, or WEEKLY.
• An events file path to load the events from.
$ java CalCon 2025-05-26 MONTHLY assets/events1.txt
Error Scenarios
The program can face the error scenarios below and must handle them by printing the error messages accordingly and terminating the program.
• Less than 3 arguments present. Print Invalid Command Line Args. Exiting Program.
• Invalid view type provided, e.g. MONTHLYYYY. Print Invalid calendar view type. Exiting Program.
If valid command line arguments are present, initialise the calendar.
File Handling
Once the calendar is initialised, it is important to read some data from the files and load them into the appropriate objects of classes. The files are present in the assets folder.
Assumption: While handling the files in code, you can assume the folder name assets remains unchanged.
File Data Description
The file has data in a comma-separated format. Where data is missing, the comma will still be present. The file has a header that describes the data present in it. The order of the data will always remain the same.
2. event_type (Mandatory) – TIMED or ALL_DAY event. Other values will be considered Invalid.
3. event_sub_type (Mandatory) – subtypes for TIMED or ALL_DAY events. See the list associated with the types here. Other values will be considered Invalid.
4. start_time (Optional) – this is only present in the case of TIMED events and represents the start time of the event. Otherwise empty.
5. end_time (Optional) – this is only present in the case of TIMED events and represents the end time of the event. Otherwise empty.
6. description (Mandatory) – a description for the event.
Order of reading the data
1.
The program should start by reading the Events file present in the command line arguments.
2. The first row is the header row, just for your information. The program must ignore this while reading the data.
3. Any empty line is simply skipped. Read the next available line.
4. Iterate through all the rows of the file mentioned in the command line args; the program should process each line by splitting the data by comma and processing individual data points. A line may contain invalid data or an invalid line format. In this case, skip the line and proceed to read the next line. The program must not be terminated. See the next section for exception handling.
Writing the files back
The program must write the changes made to the events during program execution back to the events file, using the filename provided to the program in the command line argument. The data is read from the calendar. For those dates where one or more events are present, the program will write the data back to the file. The order of the events is maintained by the order of insertion in the events array. While writing the data back, the header must be preserved as is. The program must update the events file only once, at the end of the program, when the user has quit from the main menu (see Option 4 in the Main Menu slides).
IMPORTANT: Do not forget to call the printWriter.flush() method, even though it is not a BufferedWriter. Sometimes files in EdStem are not written back. As you write into the files, they are modified, and while running all the test cases, subsequent test cases work on the modified file data. So if you face any error, recheck the saved file data.
NOTE: Windows and Linux file systems handle folder paths differently. You must comply with what EdStem follows, i.e., Linux-style paths.
Exception Handling with Files
Exceptions during File Handling
1. FileNotFoundException or IOException – The file paths that are provided to the program are not present in the directory.
Or are unavailable for file read/write operations. In these cases, Java raises FileNotFoundException or IOException. Your program should print the error message and terminate the program. Generally, these would only happen while reading the file, not while writing it, but the program must handle both cases in the code. Unable to process file. Exiting program.
2. InvalidLineException – Every file expects a minimum number of fixed data points. When there are fewer than 6 data points present in a line, the program should raise an InvalidLineException, skip reading the line, print an error message and move on to the next line. Invalid Event Line. Skipping this line.
3. InvalidFormatException – Some data points are expected to be in a particular format or are mandatory. If they are not in the correct format, the program must print an appropriate error message, skip the line and move on to the next line. Note that if one line has more than one error, only the first error encountered is printed and the program moves on to the next line. The order of processing the data points, and the respective errors, is as follows:
• If the event type is missing or incorrect, print Incorrect Event type. Skipping this line.
• If the event subtype is missing or incorrect, print Incorrect All Day event type. Skipping this line. or Invalid Timed event type. Skipping this line.
• If the start or end times are empty, print Start Time cannot be empty. Skipping this line. or End Time cannot be empty. Skipping this line.
• If the start or end times are in an incorrect format, like 25:67, print Incorrect Start Time format. Skipping this line. or Incorrect End Time format. Skipping this line.
• If the event description is empty, print Event Description is empty.
Assumption: Since we are reading a comma-separated file, none of the string data (such as the description) will contain a comma. The comma is only used as a data delimiter, never as data itself.
4.
NotFoundException – Worst-case scenario. Remember that the method that throws the exception does not catch and handle it. See the marking scheme.

Exceptions during Program Execution
There are no exceptions thrown during program execution; those cases are handled using logic. See the next few slides on program execution for details.

Main Menu
Calendar Initialisation
Modifications: This section is referred from Assignment 1 with slight modifications.

FORTNIGHTLY
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 02 | 03 | 04 | 05 | 06 | 07 | 08 |
| x  |    |    |    |    |    |    |
|    |    |    |    | 2  |    |    |
————————————
| 09 | 10 | 11 | 12 | 13 | 14 | 15 |
|    |    |    |    |    |    |    |
|    |    |    |    |    |    |    |
————————————

WEEKLY
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 02 | 03 | 04 | 05 | 06 | 07 | 08 |
|    |    |    |    | x  |    |    |
| 2  |    | 1  |    | 2  |    |    |
————————————

The next thing to show is the main menu for the user to select and perform some operations.
Modifications: This section is referred from Assignment 1 with slight modifications.

————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 02 | 03 | 04 | 05 | 06 | 07 | 08 |
| x  |    |    |    |    |    |    |
| 2  |    | 1  |    | 2  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
| x  |    |    |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 1
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
|    | x  |    |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 
Note: The text in bold represents user input here. You don’t need to make the console inputs bold in your code.
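The main-menu interaction shown above can be driven by a simple read/dispatch loop. The sketch below is illustrative only: the class name, the return strings for options 1 to 3, and the loop structure are assumptions, not part of the provided scaffold. Only option 4's goodbye message and the invalid-input message come from the spec.

```java
import java.util.Scanner;

public class MainMenuDemo {
    // Maps a main-menu choice to an action label; labels for 1-3 are hypothetical.
    public static String handle(int choice) {
        switch (choice) {
            case 1:  return "MOVE_SELECTION";     // advance the selected day, as in the "> 1" sample
            case 2:  return "SUB_MENU";           // enter the current selection's sub menu
            case 3:  return "VIEW_ALL_EVENTS";    // print all events in the calendar
            case 4:  return "Exiting CalCon Now."; // goodbye message from the spec
            default: return "Invalid input.";      // e.g. 6, 8 or 9
        }
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        // The main menu only expects integer input (strings/doubles are out of scope).
        while (in.hasNextInt()) {
            int choice = in.nextInt();
            System.out.println(handle(choice));
            if (choice == 4) break; // graceful exit; write the events file back here
        }
    }
}
```

Keeping the dispatch in a separate method, as here, makes the menu logic testable without driving the Scanner interactively.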
Option 2: Sub Menu
Carried Forward: This section is referred as is from Assignment 1.
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
|    |    | x  |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 2
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
Submenu handling is discussed here.

Option 3: Printing all events
Modifications: This section is referred from Assignment 1 with slight modifications.
If the user wants to see all the events marked in the calendar, the program must present them in a tabular form. The program must show only the events present in the calendar view, not all the events read from the file.
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
|    |    | x  |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 3
————————————————————————————————————————————————————————————————————
 1  2025-05-28     ALL_DAY       OUT_OF_OFFICE                           On Vacation – Canada
 2  2025-05-28       TIMED             MEETING       09:00       09:20      Some random event
 3  2025-05-29     ALL_DAY       OUT_OF_OFFICE                           On Vacation – Canada
 4  2025-05-30     ALL_DAY       OUT_OF_OFFICE                           On Vacation – Canada
————————————————————————————————————————————————————————————————————
Tip: Use the formatter %2s%12s%12s%20s%12s%12s%40s%n to print the header and the events.
However, if the calendar has no events, the program should show an error message.
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
|    |    | x  |    |    |    |    |
|    |    |    |    |    |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 3
No events present in the calendar.
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
|    |    | x  |    |    |    |    |
|    |    |    |    |    |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.

Option 4: Exit Option
Carried Forward: This section is referred as is from Assignment 1.
In case the user selects option 4, the program must quit gracefully and print a goodbye message. Look at the complete output below.
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
| x  |    |    |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 4
Exiting CalCon Now.

Invalid Command
Carried Forward: This section is referred as is from Assignment 1.
In case the user provides an invalid input, like 8 or 9, the program should print Invalid input. and show the main menu again. Sample output:
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
| x  |    |    |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 6
Invalid input.
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 26 | 27 | 28 | 29 | 30 | 31 | 01 |
| x  |    |    |    |    |    |    |
|    |    | 2  | 1  | 1  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 
Out of Scope: The program only expects an integer value for the main menu. You do not need to handle string or double values as input for the main menu at this point.

SubMenu
Once the submenu is printed, the user can take multiple actions.
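The Event Type and SubType prompts used by the submenu options that follow map naturally onto enums, which is also what the guidance section later hints at. A minimal sketch; the enum and method names here are illustrative, and the scaffold may organise these differently:

```java
// Hypothetical enums mirroring the valid values listed in the spec.
enum EventType { TIMED, ALL_DAY }
enum TimedEventType { REMINDER, MEETING }
enum AllDayEventType { BIRTHDAY, ANNIVERSARY, OUT_OF_OFFICE, PUBLIC_HOLIDAY }

public class EventTypes {
    // Returns the matching EventType, or null for a missing/invalid token,
    // so the caller can raise "Incorrect Event type. Skipping this line."
    public static EventType parseEventType(String raw) {
        for (EventType t : EventType.values()) {
            if (t.name().equals(raw)) return t;
        }
        return null; // any other value is considered invalid
    }

    public static void main(String[] args) {
        System.out.println(parseEventType("TIMED"));   // prints TIMED
        System.out.println(parseEventType("TIMEDDD")); // prints null
    }
}
```

Using a lookup that returns null (rather than Enum.valueOf, which throws IllegalArgumentException) keeps the "first error wins" file-parsing flow in your hands.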
Option A: Adding an event
Modifications: This section is referred from Assignment 1 with slight modifications.
Adding a Timed Event
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> A
To add an event, enter the following details
Event Type (ALL_DAY, TIMED) : TIMED
Event SubType (REMINDER, MEETING) : MEETING
Start Time (HHmm) : 10:30
End Time (HHmm) : 11:30
Description : Some random meeting
Event added successfully.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Adding an All Day Event
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> A
To add an event, enter the following details
Event Type (ALL_DAY, TIMED) : ALL_DAY
Event SubType (BIRTHDAY, ANNIVERSARY, OUT_OF_OFFICE, PUBLIC_HOLIDAY) : OUT_OF_OFFICE
Description : Sick Leave
Event added successfully.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Assumption: All inputs are valid. Invalid inputs are not tested in add/edit events.

Option E: Editing an event
Modifications: This section is referred from Assignment 1 with slight modifications.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> E
To edit an event, select an event number from below –
Following event(s) found for the day –
1. MEETING Event : 15:00 – 16:00 : Coffee chat with mentor
2. REMINDER Event : 08:00 – 08:05 : Check dishwasher maintenance
3. REMINDER Event : 15:00 – 15:15 : Call Tax office
> 3
To edit this event, enter the following details
Event Type (ALL_DAY, TIMED) : TIMED
Event SubType (REMINDER, MEETING) : REMINDER
Start Time (HHmm) : 16:00
End Time (HHmm) : 16:15
Description : Call Tax office
Event updated successfully.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Editing an All Day Event
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> E
To edit an event, select an event number from below –
Following event(s) found for the day –
1. OUT_OF_OFFICE Event : Sick Leave
> 1
To edit this event, enter the following details
Event Type (ALL_DAY, TIMED) : ALL_DAY
Event SubType (BIRTHDAY, ANNIVERSARY, OUT_OF_OFFICE, PUBLIC_HOLIDAY) : OUT_OF_OFFICE
Description : Sick Leave updated
Event updated successfully.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Option D: Deleting an event
Carried Forward: This section is referred as is from Assignment 1.
Deleting an event is similar to editing an event. The user must choose the event number from the list of events present on the day to delete. See the output below –
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> D
To delete an event, select an event number from below –
Following event(s) found for the day –
1. MEETING Event : 15:00 – 16:00 : Coffee chat with mentor
2. REMINDER Event : 08:00 – 08:05 : Check dishwasher maintenance
> 2
Event deleted successfully.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
If the user selects option V after this, the deleted event should not be present in the list.

Option V: Viewing an event
Modifications: This section is referred from Assignment 1 with slight modifications.
Viewing a day’s events simply lists all the events present on the selected day, in a concise manner.
• Timed Events show the event subtype, start and end time, along with the description.
• All Day Events show the event subtype and the description.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> V
Following event(s) found for the day –
1. OUT_OF_OFFICE Event : On Vacation – Canada
2. MEETING Event : 09:00 – 09:20 : Some random event
Tip: Use the formatter “%d. %s Event : %s – %s : %s%n” to print a Timed event for the day and “%d. %s Event : %s%n” to print an All Day event.

Option Q: Quitting the submenu
Carried Forward: This section is referred as is from Assignment 1.
If the user selects Q, the program must quit the submenu, print the calendar view and go back to the main menu.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> Q
————————————
| M  | T  | W  | T  | F  | S  | S  |
————————————
| 02 | 03 | 04 | 05 | 06 | 07 | 08 |
| x  |    |    |    |    |    |    |
| 2  |    | 1  |    | 2  |    |    |
————————————
Select an option to proceed.
Press 2 to enter current selection’s sub menu.
Press 3 to view all events in a calendar.
Press 4 to exit.
> 
Invalid Option
Carried Forward: This section is referred as is from Assignment 1.
Note that the submenu receives an input string for A/E/D/V/Q. Thus, any input which is a string but is not a valid option should be handled accordingly, with an error message printed.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> K
Invalid input.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Special Cases
Case 1: No events present for the day
Carried Forward: This section is referred as is from Assignment 1.
In case of editing/deleting/viewing, if there are no events present for the day, show the error message and print the submenu.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> V
There are no events marked in the calendar for the day.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Case 2: Meetings on Non-Working Days
Addition: This specification is entirely new to Assignment 2.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> A
To add an event, enter the following details
Event Type (ALL_DAY, TIMED) : TIMED
Event SubType (REMINDER, MEETING) : MEETING
Start Time (HHmm) : 10:30
End Time (HHmm) : 11:30
Description : Some random meeting
Event added successfully.
I am not available today. Please reschedule the event.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Assumption: Adding other kinds of events to a Non-Working day is not tested.

Case 3: Conflicts in various meeting timings
Addition: This specification is entirely new to Assignment 2.
While adding/editing, the meeting may have a time overlap with other meetings, i.e.
• the meeting has the same start time as another meeting;
• the meeting has an end time after the start time of another meeting but before that meeting’s end time;
• the meeting has a start time after the start time of another meeting but before that meeting’s end time;
• the meeting has a start time before the start time of another meeting, but an end time after that meeting’s start time.
In this case, the program should send an email: create a method sendEmail() that prints the error message There is a conflict for this event, Reschedule the event.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> V
Following event(s) found for the day –
1. REMINDER Event : 14:00 – 14:15 : Take medicine
2. MEETING Event : 10:30 – 11:30 : Team sync-up
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> A
To add an event, enter the following details
Event Type (ALL_DAY, TIMED) : TIMED
Event SubType (REMINDER, MEETING) : REMINDER
Start Time (HHmm) : 10:45
End Time (HHmm) : 11:00
Description : Call mom.
Event added successfully.
There is a conflict for this event, Reschedule the event.
Press V to view all events.
Press A to add an event.
Press E to edit an event.
Press D to delete an event.
Press Q to exit.
> 
Note:
1. A Reminder and a Meeting can conflict. A Non-Working Day and a meeting can also conflict, but that is covered in Case 2.
2. The sendEmail functionality only applies to certain AllDayEvents and all TimedEvents.
3. See the FAQ to understand when to send an email for Case 2 and Case 3.
Out of Scope Scenario: Birthdays/Anniversaries overlapping with Meetings/Reminders/Non-Working Days are not tested.

Guidance: Object Oriented Programming
Quick Tips
• Now that you know about UML, consider starting your solution by creating a UML diagram first. Think about what classes need to be created. Some of them are provided to you as a scaffold; you can create other classes as well.
• The second step is to associate the appropriate data with the appropriate classes. Create the data fields as instance variables in those classes.
• The third step is to create the exceptions.
• The next step is file handling. This will take the maximum amount of time.
• The next step is to create a structure for running the menu options with the different control flows you have learned.
• Exceptions thrown in a method are not handled in the same method. You should handle them with a try-catch block in the methods that invoke the method causing the exception. For example, when Integer.parseInt throws a NumberFormatException, you handle it in the method where you call Integer.parseInt.
• Create packages to logically group entities, interfaces and exceptions.
• Think about enums.
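The four overlap conditions listed in Case 3 earlier collapse into a single standard interval test: two meetings conflict exactly when each one starts strictly before the other ends. A sketch using minutes since midnight; the class and helper names are illustrative, while sendEmail() and its message come from the spec:

```java
public class ConflictCheck {
    // Converts "HH:mm" to minutes since midnight. Assumes valid input here;
    // format validation is done during file parsing, per the spec.
    static int toMinutes(String hhmm) {
        String[] parts = hhmm.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    // All four Case 3 bullet points reduce to this one test:
    // the events conflict iff each starts strictly before the other ends.
    static boolean conflicts(String start1, String end1, String start2, String end2) {
        return toMinutes(start1) < toMinutes(end2)
            && toMinutes(start2) < toMinutes(end1);
    }

    // Per the spec, "sending an email" is just a print statement.
    static void sendEmail() {
        System.out.println("There is a conflict for this event, Reschedule the event.");
    }

    public static void main(String[] args) {
        // 10:45-11:00 sits inside 10:30-11:30, as in the sample run.
        if (conflicts("10:45", "11:00", "10:30", "11:30")) sendEmail();
        // Back-to-back meetings do not conflict under the strict inequality.
        System.out.println(conflicts("10:00", "11:00", "11:00", "12:00")); // false
    }
}
```

Note that the strict inequalities mean back-to-back meetings (one ending exactly when the other starts) are not conflicts, which matches the "after the start time" wording of the bullet points.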
Interfaces & Inheritance
• Implement inheritance between related entities. You must have at least one inheritance hierarchy.
• Think about what the common operations are for different entities. Interfaces are implemented where unrelated/related entities have similar behaviour but different implementations. You must create and implement at least one interface.

File Parsing
• Use a simple Scanner and PrintWriter instead of over-complicating the reading/writing. Try to minimise duplicate code associated with reading and writing.
• Do not forget to use flush() when writing.
• When you read the files, you can read the entire line and then split the data on commas. This provides you with the different data points in an array. Check the split method in the String class here; you can use this method to perform the split on a comma.

JavaDoc
Your code should be annotated using javadoc comments. We will generate the javadoc for your code; you do not need to submit it. You can run javadoc on your machine (up to any level of nested packages) with the command
$ javadoc -d docs/ **/*.java

UML
• There are some classes present, but they are not grouped in packages. You must create packages and add the classes to packages.
• There are some sample packages created that you can use. Some methods are already present there.
• An italicised class is an abstract class, and an italicised method is an abstract method. You are free to change the definition of the abstract method by adding parameters if you see fit, but the Event class must be abstract.
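The File Parsing guidance above (a plain Scanner with a split on commas for reading, a PrintWriter with flush() for writing) can be sketched as follows. The class and method names are illustrative; the header line is kept verbatim from the file rather than reconstructed, since the spec requires it to be preserved as is:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class EventFileIO {
    // Reads the remaining lines of an events file: empty lines are skipped and
    // each line is split on commas. The -1 limit keeps trailing empty fields,
    // so an ALL_DAY line with empty start/end times still yields 6 data points.
    public static List<String[]> readRows(Scanner scanner) {
        List<String[]> rows = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            if (line.trim().isEmpty()) continue; // skip empty lines
            rows.add(line.split(",", -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        try {
            Scanner scanner = new Scanner(new File(args[0]));
            // The first row is the header; keep it so it can be written back as is.
            String header = scanner.hasNextLine() ? scanner.nextLine() : "";
            List<String[]> rows = readRows(scanner);
            scanner.close();

            PrintWriter writer = new PrintWriter(new File(args[0]));
            writer.println(header); // the header must be preserved as is
            for (String[] row : rows) writer.println(String.join(",", row));
            writer.flush();         // per the spec: do not forget flush()
            writer.close();
        } catch (FileNotFoundException e) {
            System.out.println("Unable to process file. Exiting program.");
        }
    }
}
```

Splitting with a -1 limit matters: String.split(",") drops trailing empty strings, which would make a valid ALL_DAY line look like it has fewer than 6 data points.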
