
CS6262 Summer 2025 Project 3: Malware Analysis

Summer 2025

Project Overview
In this project, you will analyze malware samples on Windows, Linux, and Android. Your tasks include identifying behaviors, reconstructing C2 servers, and analyzing communication and attack patterns. Each stage includes 4–6 behavior-related questions.

Part 1: Windows Malware (stage1.exe and stage2.exe)
Use static and dynamic analysis with tools like Wireshark, Cuckoo Sandbox, and angr to capture traffic, find valid commands, and analyze behaviors such as file downloads and system changes.

Part 2: Linux Malware (payload.exe)
Analyze instruction traces to uncover communication points and attack logic using symbolic execution, loop detection, and custom scripts.

Part 3: Android Malware (sms.apk)
Use jadx and apktool to reverse engineer the app, uncover SMS-based C2 behavior, and bypass emulator checks to trigger hidden actions.

Tutorials
We provide step-by-step guides for tools like Wireshark, Cuckoo Sandbox, angr, and some customized tools at the end of this writeup. You are encouraged to use these guides to get started, but you’re also free to use your own methods.

Submission Instructions
1. Submit report.zip to Canvas under the Project 3 assignment.
2. Submit assignment-questionnaire.txt to Gradescope.

Environment Setup

Virtual Box
Make sure to install/update to the latest version of VirtualBox
● https://www.virtualbox.org/wiki/Downloads

Project VM Setup
Download the Virtual Machine (VM)
● https://www.dropbox.com/s/dnk6acztw9ewp83/Project%203.zip?dl=0
● Unzip the file with 7zip with password: cs6262

Open VirtualBox
● Go to File → Import Appliance
● Select the Project 3 Malware Analysis.ova file and import it
● For detailed information on how to import the VM, see: https://docs.oracle.com/cd/E26217_01/E26796/html/qs-import-vm.html
● VM user credentials
  ○ Username: analysis
  ○ Password: analysis

Network configurations
● tap0
  ○ Virtual network interface for Windows XP
    ■ IP Address: 192.168.133.101
● br0
  ○ A network bridge between Windows XP and Ubuntu
    ■ IP Address: 192.168.133.1
● enp0s3
  ○ A network that faces the Internet
    ■ IP Address: 10.0.2.15 (it varies with your VirtualBox settings)

VM performance configuration suggestions
● Recommended set-ups
  ○ Option 1: 4 CPU cores + 4 GB RAM
  ○ Option 2: 2 CPU cores + 8 GB RAM
● VM version
  ○ Use 6.1.22 or any later version.
● Performance tips
  ○ Close all heavy applications on your host machine.
  ○ Turn off VM audio.
  ○ Adjust the display resolution in the hypervisor to 150%.
  ○ Set the VM’s resolution to 1280 x 1024.
  ○ If the VM freezes, simply reboot it.
● For Mac users only
  ○ Update to macOS Ventura 13.4.1 for better performance.

VM setup suggestions
● For M Series Mac Users:
  ○ Install the latest version of UTM: https://mac.getutm.app/
  ○ Follow the instructions to import and set up the VM: https://docs.getutm.app/

Miscellaneous VM Performance Tips
● Try lowering your screen resolution
● Save often!
● Avoid using a resource-heavy IDE like IntelliJ or Eclipse. Lightweight alternatives include gedit, vim, emacs, Sublime Text, Visual Studio Code, and nano.
● Most importantly, run only 1 task at a time.
That means:
  ○ Run the Windows VM only when:
    ■ Sending commands to malware
    ■ Analyzing network traffic via Wireshark
    ■ Once done with those tasks, turn off the Windows VM.
  ○ Avoid running the Windows VM when:
    ■ Running Cuckoo analysis
    ■ Generating CFGs
    ■ Running symbolic execution. This is quite resource intensive, so avoid doing anything else while it runs. (TIP: If it seems to take unbounded memory or time, you are most likely trying to reach an unreachable or invalid address. Check your addresses!)
  ○ If your host machine has a very high resolution, try running the VM at a lower one (at least 1280×800 is recommended for legibility). You can do this in 2 ways:
    ■ VirtualBox Menu: View > Virtual Screen 1 > Resize to a x b
    ■ Ubuntu Menu: Type “Displays” > Change it there
  ○ Restart after a task / stage. This is mostly a last resort, but restarting the VM after finishing a task or stage makes everything run much more smoothly than trying to free memory by hand. Just be sure to run ./reset in ~/tools/network after each VM restart!
● Android malware only
  ○ Restarting after working on Stage 1 helps a lot.
  ○ If your Android emulator still feels slow, you can add the following flags to the emulator command in ~/bin/run-emulator:
    -memory 2048 -gpu swiftshader

Part 1: Windows Malware (Stage 1 & Stage 2)

1) Initialization

Update Project 3 before you begin
● Open a terminal
  ○ Ctrl+Alt+T, or select Terminal from the menu
● Run the update script
  ○ ./update.sh
  ○ It will update any files that are required for this project.

Initializing the project
● Open a terminal
  ○ Ctrl+Alt+T, or select Terminal from the menu
● Run the initialization script
  ○ ./init.py
  ○ This will download stage1.exe into the ~/shared directory
● Check the type of file
  ○ $ file
● Unzip
  ○ unzip
  ○ password: infected
  ○ After extraction, ensure the file is a valid executable
● Important Note
  ○ Always verify the file type after downloading and unzip if needed.

2) Malware Analysis Workflow

Scenario
You got a malware sample named stage1.exe. Your task is to analyze it and uncover its behavior. How do you approach this?
● Static Analysis
  ○ Manual Reverse Engineering
  ○ Programmatic binary analysis
● Dynamic Analysis
  ○ Network behavioral tracing
  ○ Run-time system behavioral tracing (File/Process/Thread/Registry)
  ○ Symbolic Execution
  ○ Fuzzing

In this scenario, you are going to analyze the given malware with tools that we provide. These tools help you perform static and dynamic analysis. (See the tutorial at the end of this document for how to use them.)

Objective
● Identify the Command and Control (C2) server that the malware connects to.
● Understand how the malware communicates with the C2 server.
  ○ URL and Payload
● Discover the malicious activities carried out by the malware
  ○ Attack activities

Tasks
● Make sure that no malware traffic goes out from the virtual machine
  ○ However, allow updates (stage 2) and Linux payload downloads (stage 3)
● Since the original C2 server is dead, you will need to reconstruct it.
  ○ Use the provided tools to rebuild the server and uncover hidden malware functions
● Analyze:
  ○ Network traffic on the host, and figure out the list of available commands for the malware
  ○ Network traffic and program trace of the host, and figure out what the malware does
● The questions are in assignment-questionnaire.txt. Read them and write down your answers there.

Secure Experiment Environment
● The problem: analyzing malware on an unprotected machine risks
  ○ Encrypting your files during a ransomware analysis
  ○ Infecting machines in your corporate network during a worm analysis
  ○ Creating tons of infected bot clients in your network during a bot/trojan analysis
● The solution
  ○ Use a Virtual Machine (VM) and Virtual Network
  ○ Apply strict network rules to control malware traffic
  ○ Provide a Windows XP VM as a safe testbed (See tutorials for setup and usage)

Network Behavior Analysis
Tools (all tools for this and later analysis are pre-installed; see the tutorials for how to use them)
● Wireshark: Network Protocol Analyzer
● Cuckoo: Capturing & Recording inbound/outbound network packets
What you are looking for
● What kind of messages is the malware trying to send?
● Where is it trying to send them?
● What does the message format look like?

Tracing Analysis: What the malware might do when it gets commands
Tools
● Cuckoo
● Procmon (in the ProcessMonitor folder of the Windows testbed VM)
What you are looking for
● What system calls or APIs does the malware use?
● Does it create, read, or write any files?
● Does it modify registry entries?

CFG Analysis: How the malware is structured
Tools
● Cuckoo
● CFG tools
What you are looking for
● Which code paths exist, and how different functions and decisions are connected.
● Where the malware interprets incoming commands and initiates malicious actions

Symbolic Execution: What to send from the fake C2 server to make the malware react
Tools
● Symbolic executor and solver
Workflow
● Instead of feeding real inputs, you use symbolic variables.
● Symbolic execution walks the CFG, collecting constraints at each decision point.
● Finally, it solves these constraints to find a real command string that would drive the malware into executing a target function.

Reconstruct C2 Server
● After CFG analysis + symbolic execution, reconstruct the C2 server
● The tool for reconstructing the C2 server is already on the VM
● It runs nginx and a PHP script
  ○ The command files look like ~/tools/c2-command/stage*-command.txt
  ○ Your job is to add your commands to the relevant *.txt file
    ■ The command that leads the execution from 405190 to 40525a is
    ■ Important: be sure to put the ‘$’ character before your commands, even if stage*-command.txt says that it’s optional
    ■ The order of commands in the file does not matter; they’ll run in a random order
      ● Note: This means that if you want to run only a particular command, you’ll need to remove or comment out the other commands in your file

3) Stage 2

After stage 1
● If you find all of the commands for the stage1.exe malware, the malware will download stage2.exe by updating itself.
● By now you have found the commands by running sym_exec.py.
● Add those commands to stage1-command.txt. Remember to put $ before each one.
● Start up the Windows VM again, then copy stage1.exe to the desktop. Then double-click on it and continue.
● Note: if stage1 fails to download stage2, your firewall might be blocking it
  ○ This is actual malware, so some IDS have signatures that match it.

Stage 2
● For stage2.exe, please follow the same steps in the tutorial
● Check its network access with Wireshark
● Redirect network traffic to if required (if the connection fails)
● Try to identify malicious functions by editing score.h and using the cfg-generation tool
● Discover the list of commands using the symbolic execution tool
● Fill the commands in ~/tools/c2-command/stage2-command.txt
● Run it as mentioned before.

Part 2: Linux Malware (Stage 3: payload.exe)

Workflow
● First copy the Linux malware into a shared folder.
The tools which you will use are installed inside the Linux host.
● ~/tools/sym-exec/linux_sym_exec.py
  ○ For Linux malware symbolic execution
  ○ python linux_sym_exec.py path_to_linux_mw start target
  ○ To make it work, you need to modify two linux_sym_exec.py functions
    ■ targs_len_before and opts_len_before
● ~/tools/dynamicanalysis/
  ○ instrace.linux.log: the dynamic instruction trace for the Linux malware
  ○ detect_loop.py: you can modify this file to find the loop in the given trace
  ○ Usage: python detect_loop.py
● Run ‘python linux_sym_exec.py path_to_linux start target’.
● It won’t be able to find any input because of path explosion. You need to add constraints to make symbolic execution targeted
● Follow the steps in assignment-questionnaire.txt and find the inputs.
● Analyze the dynamic instruction trace and locate the C&C communication

We provide tutorials based on angr and radare2.

Other Tools:
● You don’t have to use Radare2.
  ○ objdump
  ○ IDA-Pro (Disassembly tool with GUI) (Free version)
    ■ https://www.hexrays.com/products/ida/support/download_freeware.shtml
  ○ Cutter (GUI for radare2)
    ■ https://www.radare.org/cutter/
    ■ https://github.com/radareorg/cutter

Part 3: Android Malware (sms.apk)

Scenario
● Analyzing Android Malware
  ○ You have received a malware sample sms.apk.
  ○ You need to identify communication with the C&C server
  ○ Identify anti-analysis techniques being used by the app.
  ○ Identify commands that trigger any malicious behavior.

Structure
● Android emulator
  ○ An emulator for Android 4.4 is pre-installed
    ■ Run ‘run-emulator’
      ● This will start the Android emulator (this takes a long time, especially the first time you start it)
● Jadx
  ○ Decompiles apk files into Java source code.
  ○ Run ‘jadx-gui’
    ■ Choose the apk file to decompile
● Apktool
  ○ Disassembles apk files into Smali.
  ○ Rebuilds apk files.
● Android App
  ○ ~/Android/MaliciousMessenger/tutorialApps
    ■ Emu-check.apk
      ● A tutorial example (Shown as ‘My application’ in the emulator)
    ■ CoinPirate.apk
      ● Another tutorial example
  ○ ~/Android/MaliciousMessenger/sms.apk
    ■ Target app to analyze to answer the questionnaire

Approach Overview
● Manifest Analysis
  ○ Identifying suspicious components
● Static Analysis
  ○ Search for C&C commands and trigger conditions
  ○ Vet the app for any anti-analysis techniques that need to be removed
● Dynamic analysis
  ○ Leverage the information found via static analysis to trigger the malicious behavior.
● A cheatsheet is provided in the tutorial section

Manifest Analysis
● Identify suspicious components
  ○ Broadcast receivers registering for suspicious actions.
  ○ Background services
● Use jadx
  ○ Inspect AndroidManifest.xml
● Narrow the scope of analysis
  ○ Malicious apps are repackaged in benign apps with thousands of classes.
  ○ Example: Broadcast receiver from CoinPirate’s malware family

Static Analysis
● Search for C&C commands and trigger conditions
● Use jadx
  ○ Check resources.arsc for string values of the messages: https://github.com/skylot/jadx/issues/2373
● Identifying Anti-analysis techniques

Stage 1

Question 4.5.1 (5 points)
● Run the command ./start_server in ~/Android/MaliciousMessenger/ and verify the server is active.
● Start the Android emulator.
● Use the preinstalled People app to add a contact to the device. The name of the contact should be your GT username (e.g. JDoe2).
● Open the app that is named Messenger (not Messaging). This is the app installed from sms.apk.
● The server will ask you if your GT username is correct. If so, press ’y’ and you will receive the first answer.
● Answer Format: You need to copy & paste the string between the answer tags. For example, if the output was < answer > 1234 < /answer >, your answer would be 1234. You should follow this answer format for all answers you receive from the server.
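The answer format above can be checked mechanically. The following is a minimal Python sketch, not part of the provided tools: the function name extract_answers is hypothetical, and it assumes the spaces around the tags and around the value are padding rather than part of the answer, as the 1234 example suggests.

```python
import re
from typing import List

# Matches text between answer tags; optional spaces inside the tags are
# tolerated, e.g. "< answer > 1234 < /answer >".
ANSWER_RE = re.compile(r"<\s*answer\s*>(.*?)<\s*/\s*answer\s*>", re.DOTALL)


def extract_answers(server_output: str) -> List[str]:
    """Return every string found between answer tags, without padding."""
    return [match.strip() for match in ANSWER_RE.findall(server_output)]


sample = "Username confirmed. < answer > 1234 < /answer >"
print(extract_answers(sample))
```

Pasting the server output into sample and comparing the printed value with what you typed into the questionnaire is a quick way to catch stray whitespace.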
● After receiving your answer for this question, you can turn off the Android emulator and server. They will not be needed until later on. The remaining part of Stage 1 uses JADX to analyze the source code of sms.apk.

Question 4.5.2 (10 points)
● Question: What is the name of the component that is used for communicating with the C&C server?
● Answer Format: If the correct component was the receiver described below, the answer would be com.android.AReceiver.

Question 4.5.3 (20 points)
● Using SMS as a protocol for a C&C server is an important design decision that differs from the traditional IP-based approaches known from infected PCs. The main advantages of an SMS-based approach over an IP-based one are that it does not require steady connections, that SMS is ubiquitous, and that SMS can accommodate offline bots easily. sms.apk leverages SMS to receive commands from its C&C server, and you need to identify those commands.
  ○ Question: When sms.apk receives a text message, it checks to see if the message matches a command. What are the commands?
  ○ Answer Format: A list of commands (sms bodies) sms.apk receives from its C&C server. The list should be separated by line breaks (one command per line).
● At this point we should have enough information to trigger the malicious behavior. The C&C server can be started by running ./start_server from the command line. Start the server and send the necessary text messages. Unfortunately, no malicious behavior will be exhibited. This is because the malicious app contains anti-analysis techniques that prevent analysis. Our next goal will be to find them and see if we can emulate these triggers or remove them.

Question 4.5.4 (20 points)
● The Android/BadAccents malware contains two specific checks on the incoming SMS number. It checks for ‘86’ and ‘82’ numbers, which indicates that the malware expects SMS from a C&C SMS server located in either China or South Korea.
It seems the app we are inspecting does something similar.
  ○ Question: What country code does sms.apk require the incoming text message to have before the malicious behavior will be triggered?
  ○ Answer Format: The country code required to trigger the commands (hint: If you believe you’ve located the correct spot but your answer is still incorrect, double-check how the ending index of a substring is defined in Java: https://www.geeksforgeeks.org/substring-in-java/).

Stage 2

From Stage 1, we know the required country code and the necessary commands to trigger the malicious behavior. However, even if we send the correct commands with the correct country code, sms.apk will still not exhibit any malicious behavior. To maximize the longevity of malware, malicious developers aim to prevent analysis. Since the majority of dynamic analysis frameworks are based on emulation, these developers often integrate anti-analysis techniques to alter an app’s behavior. If an app detects that it is running in an emulated environment rather than on a real device, it will behave differently to avoid appearing suspicious. For Stage 2, we will attempt to identify how sms.apk detects whether it is running on an emulator. Then, we will modify sms.apk to remove this check and successfully trigger the malicious behavior.

Question 4.6.1 (15 points)
The most basic form of emulation detection is when a malicious app leverages a static heuristic. Static heuristics are pre-initialized values that provide information about the underlying environment. Apps running on a system can check these static heuristics by calling Android APIs. For many of these values, the emulator will return results that are inconsistent with what would be expected on a real device. For example, if the TelephonyManager.getDeviceId() API returns all 0’s, the device in question is likely an emulator. This is because such a value cannot exist on a physical device.
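The all-zeros device ID example can be modeled in a few lines. This is only an illustrative Python sketch of the idea, not code from sms.apk: on Android the value would come from TelephonyManager.getDeviceId(), whereas here it is passed in as a plain string, and the names looks_like_emulator and handle_command are hypothetical.

```python
def looks_like_emulator(device_id):
    """Static heuristic: an ID made up only of zeros suggests an emulator."""
    return bool(device_id) and set(device_id) == {"0"}


def handle_command(device_id, command):
    """Model of an app that hides its behavior when the heuristic fires."""
    if looks_like_emulator(device_id):
        # Emulated environment suspected: do nothing suspicious.
        return "ignored"
    return "executed " + command


print(handle_command("000000000000000", "example-command"))
print(handle_command("358240051111110", "example-command"))
```

The sketch shows why sending the right command is not enough on an emulator: the check runs first and short-circuits the malicious branch, which is the behavior Stage 2 asks you to locate and remove.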
● Question: sms.apk is leveraging an Android API to identify whether the underlying environment is emulated. The return value of the API provides sms.apk with a static heuristic about the emulated environment. sms.apk compares the returned value to a hard-coded string. What is the value of this string?
● Answer Format: The value of the string that the static heuristic is being compared to. For example, if emulation check = “01234” your answer would be 01234.

Question 4.6.2/3/4 (30 points)
The final question requires you to first modify sms.apk and remove the environment check so that we can run sms.apk on an emulator, and then send the commands found in Stage 1 to the emulator and observe whether it exhibits malicious behavior. Upon success, the C&C server will generate the final answers.
● Question: What are the strings the C&C server provides you with when you dynamically trigger the malicious behavior in sms.apk?
● Answer Format: For each command sent to sms.apk, the C&C server will print out a string. The answer will be a list of strings, one for each command. In your questionnaire, place each string on a separate line.

If you have no previous experience modifying APKs, it’s recommended that you start by removing the emulation check from emu-check.apk before working on sms.apk.
● You will need to modify sms.apk so that it triggers its malicious behavior while running on an emulator.
● Note: The modification required is extremely small. If you find yourself modifying more than a few characters, you are likely going in the wrong direction.
● If you have the correct answers from the previous steps, start up the server using the command ./start_server. Once the server has started, trigger the malicious behavior by sending it commands. If successful, the server will return an answer for each respective command (the order does not matter).
Copy and paste each answer into your report.

Submission Instructions

Required files
● Zip the following files and upload report.zip to Canvas (Modifying score.h is not required. Submitting it unmodified is fine.)
  ○ Running ~/archive.sh will automatically zip all of the files
    ■ ~/report/assignment-questionnaire.txt
    ■ stage1.exe, stage2.exe, payload.exe (Linux malware)
    ■ ~/tools/network/iptables_rules
    ■ ~/tools/cfg-generation/score.h
● Running ~/archive.sh will create report.zip automatically.
  ○ Please check the content of your zip file before submitting it to Canvas
● Submit ‘assignment-questionnaire.txt’ to Gradescope.
● Submit report.zip to Canvas under Project3 Assignment.
● Important:
  ○ Missing report.zip on Canvas will result in a zero for the project.
  ○ Late assignment-questionnaire.txt submissions (before the regrading period ends) incur a 5-point deduction.

Rubrics
● The value for each max score is within its particular section
  ○ Windows has 110 possible points
  ○ Android has 100.
  ○ As each section is worth an equal amount of your overall P2 grade, we normalized the Windows score by dividing by 1.1 (and rounded up), then averaged it with the Android score to get your final grade. So effectively, each point in the table above is worth half a point of your final project grade (slightly less for Windows).
● If the Partial Credit column is blank, there is no partial credit for the question. “Ratio” refers to the Levenshtein ratio, a metric of similarity between strings.

Tutorials
All tools mentioned in the tutorials are pre-installed in the VM!

Windows Testbed
● Run Win XP VM
  ○ Run the Windows XP Virtual Machine with virt-manager
  ○ Open a terminal
  ○ Type “virt-manager” and double click “winxpsp3”
  ○ Click the icon with the two monitors and click on “basecamp”
  ○ Right click on basecamp, and click “Start snapshot.” Click Yes if prompted.
  ○ Once virt-manager has successfully loaded the snapshot, click Show the graphical console.
    ■ Click on the Windows Start Menu and Turn off Computer.
    ■ Then select Restart
  ○ DO NOT MODIFY OR DELETE THE GIVEN SNAPSHOTS!
    ■ The given snapshots are your backups for your analysis.
    ■ If something bad happens on your testbed, always revert back to the basecamp snapshot.

Copy from Shared Directory
● Go to the shared directory by clicking its icon (in Windows XP)
  ○ Copy stage1.exe to the Desktop
  ○ If you execute it in the shared directory, an error message will pop up. Please copy the file to the Desktop.

Run the malware
● How to Run
  ○ Double-click stage1.exe to execute it.
  ○ When prompted with “Executing Stage 1 Malware”, click OK.
  ○ Click OK on every dialog box that appears to allow the malware to continue running.
  ○ If you do not respond to the dialogs, the malware execution will be blocked.
● How to Stop
  ○ Open the temp directory.
  ○ Execute stop_malware.
  ○ Always stop the currently running malware before executing another malware file.

Wireshark
● Setting up Wireshark
  ○ Wireshark is pre-installed in the project VM.
  ○ Run with sudo
  ○ Open Wireshark and start capturing traffic on the network bridge interface.
  ○ Always run Wireshark with root privileges for proper network capture.
  ○ Focus on monitoring traffic to and from IP address 192.168.133.1 (the fake C2 server).
● Redirecting network connections
  ○ By default, the malware tries to connect to the dead C2 server at 128.61.240.66.
  ○ To redirect traffic to your fake C2 server (192.168.133.1):
    ■ Open a terminal and go to ~/tools/network
    ■ Edit the iptables_rules file to redirect traffic:
      ● From: 128.61.240.66
      ● To: 192.168.133.1
    ■ Apply the updated firewall rules:
      ● ./reset
      ● If you reboot your project VM, you must rerun ./reset to reapply the firewall settings!
● Reading C2 Traffic in Wireshark
  ○ After redirection, the malware will start communicating with your fake C2 server.
(However, it won’t execute because the command is still wrong.)
  ○ To view the communication:
    ■ Right-click on a relevant network packet
    ■ Select Follow → TCP Stream

Cuckoo Analysis
● What is Cuckoo?
  Cuckoo Sandbox is a tool that shows you what the malware is doing in a safe environment. You don’t need it to finish the project, but it can make analysis easier and help you find clues for modifying your score.h file later.
● Important note
  ○ Always shut down the Windows XP testbed VM before starting Cuckoo. Running both at the same time can corrupt the malware download, and you will need to start from the very beginning.
● How to start?
  ○ Open two terminal windows
  ○ In both terminals, type the following command to activate the Cuckoo virtual environment (set virtualenv as cuckoo)
    ■ workon cuckoo
  ○ In Terminal 1, start Cuckoo in debug mode
    ■ cuckoo -d
  ○ In Terminal 2, start the Cuckoo web interface
    ■ cuckoo web
  ○ Reference: Malware Analysis using Cuckoo Sandbox
  ○ Troubleshooting
    ■ If you get an error about port 8000 already being in use, fix it by running
      ● sudo fuser -k 8000/tcp
    ■ Then restart the web server with cuckoo web.
  ○ Snapshot
    ■ Cuckoo uses a saved snapshot 1501466914 of your testbed VM.
    ■ No need to start it manually.
    ■ Do NOT modify or delete this snapshot!
● Upload a file to Cuckoo
  ○ Open Chromium and go to: http://localhost:8000
  ○ Click the blue arrow to upload a file (stage1.exe).
  ○ After uploading, click Analyze.
● Tracing Analysis on Cuckoo
  On the side bar, there are useful menus for tracing analysis.
    ■ We are focusing on:
      ● Behavioral Analysis
        ○ Trace behaviors in time sequence
      ● Static Analysis
        ○ API/System Call
● Behavior analysis
  ○ Tracing a behavior (file/process/thread/registry/network) in time sequence.
  ○ Useful to figure out cause-and-effect in process/file/network.
  ○ Malware creates a new file and runs the process, then writes it to memory.
● Static Analysis
  ○ Information about the malware.
  ○ Win32 PE format information
    ■ Windows binaries use the PE format
    ■ Complicated structure
    ■ Sections include
      ● .text
      ● Strings, etc.
      ● .data
      ● .idata
      ● .reloc
  ○ More information: Malware researcher’s handbook (demystifying PE file)
  ○ Interestingly, three DLL (Dynamic Link Library) files are imported.
  ○ In WININET.dll, we can see that the malware uses the HTTP protocol.
  ○ In ADVAPI32.dll, we can check whether the malware touches the registry
  ○ In Kernel32.dll, we can check whether the malware waits for a signal or sleeps.
● Cuckoo analysis result
  ○ The malware touches (create/write/read) a file/registry/process
    ■ This might be a dropper? Or does it download a binary from the C2 server?
    ■ What is the purpose of creating processes? Modifying the registry?
  ○ The malware uses the HTTP protocol to communicate
    ■ Communicate with whom? C&C?
    ■ Web server access? For checking if the C2 server is active?
    ■ Commands through the HTTP protocol? Cookies?

Control Flow Graph Analysis
● CFG:
  ○ Graph representation of computation and control flow in the program
  ○ Nodes are basic blocks
  ○ Edges represent possible flow of control from the end of one block to the beginning of the other.
  ○ An example
  ○ But, in malware analysis, we are analyzing the CFG at the instruction level. We provide a tool that helps you find the command interpretation logic and the malicious logic
    ■ We list the system call functions the malware uses internally
    ■ If you provide a score for each of those functions (how malicious it is, or how likely the malicious logic is to use it), then the tool will find where the malicious logic is, based on its score
      ● Example: if you set StrCmpNIA to have a score of 10, then a function that calls StrCmpNIA 5 times within itself will have the score 50.
      ● A higher score implies that more functions related to the malicious activity are used within the malware.
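The scoring arithmetic in the example above can be sketched as follows. This is a hedged Python model of the idea only: the real tool reads its scores from score.h and works on the binary, while the function name function_score and the dictionary layout here are assumptions made for illustration.

```python
from collections import Counter


def function_score(called_apis, api_scores):
    """Sum the per-API score over every call made inside one function.

    called_apis: names of the API calls found in the function body.
    api_scores:  mapping from API name to the score you assigned to it.
    APIs without an assigned score contribute 0.
    """
    counts = Counter(called_apis)
    return sum(api_scores.get(api, 0) * n for api, n in counts.items())


# The example from the text: StrCmpNIA scored 10 and called 5 times.
scores = {"StrCmpNIA": 10}
calls = ["StrCmpNIA"] * 5
print(function_score(calls, scores))  # 50
```

Because the total grows with every scored call, giving non-zero scores only to APIs you believe the command handling or attack logic uses makes the interesting functions stand out in the generated graph.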
    ■ Your job is to write the score value for each function
  ○ More info: http://www.cs.cornell.edu/courses/cs412/2008sp/lectures/lec24.pdf
  ○ From our network analysis, we know that the malware uses an Internet connection to 128.61.240.66
  ○ From our Cuckoo-based analysis, we know that the malware uses the HTTP protocol.
  ○ Moreover, it uses some particular functions to communicate and stay in touch with the command and control server.
  ○ Modify the score values for these particular functions in order to generate a better CFG for proper analysis.
  ○ Find the file to be edited: score.h.
  ○ Path: ~/tools/cfg-generation/score.h
  ○ Build the control flow graph
    ■ By executing ./generate.py stage1, the tool gives you the CFG
      ● This finds the function with the higher score
        ○ This implies that it calls high-score functions during its execution
    ■ For stage2
      ● Use ’stage2’ as the argument
  ○ Note: your graph and its memory addresses will vary from this example
  ○ The function entry is at the address 405190
    ■ And, there is a function (marked as sub) of score 12
      ● At the address 40525a (marked in red)
      ● Use the block_address, not the call sub_address
    ■ This implies that
      ● sub_4050c0 calls some Internet-related functions.
      ● We need to find out what this command is
        ○ Run from 405190 to 40525a

Symbolic Execution
● Finding Commands with Symbolic Execution
  ○ We want to find a command that drives the malware from 405190 to 40525a
    ■ Let’s do symbolic execution to figure that out
● What is symbolic execution?
  ○ Rather than executing the program with some input, symbolic execution treats the input data as a symbolic variable, then tries to calculate expressions for the input along the execution.
  ○ Symbolic execution moves along the path of conditional statements, and combines all conditions until it reaches the target function.
At the end, it solves the expression to get an input that satisfies all of the conditions
  ○ Path explosion
  ○ Modeling statements and environments
  ○ Constraint solving
● Example 1
  ○ In this example, ONLY the conditions i=2, j=9 will lead the program to print “Correct!”
  ○ Symbolic execution is able to solve the expression in order to reach a target, in this case “Correct!”.
  ○ Let’s apply it to malware Command & Control logic. A C&C bot (malware) expects inputs (the solutions to the expressions) to trigger behaviors (the targets).
● Example 2
  ○ This executes attack() on the command ‘launch-attack’, and destroy_itself() on the command ‘remove’
  ○ In this example, ONLY the ‘launch-attack’ and ‘remove’ commands (inputs) trigger attack() and destroy_itself().
  ○ Symbolic execution is able to find “launch-attack” as an input to trigger attack(), which is a malicious behavior. Likewise, “remove” will lead to destroy_itself(), which is another behavior.
  ○ Our job in this project with symbolic execution is to find inputs, and then feed the inputs to trigger behaviors.
● Symbolic execution engine
  ○ Klee, Angr, Mayhem, etc.
  ○ Loading a binary into the analysis program
  ○ Translating a binary into an intermediate representation (IR)
  ○ Translating that IR into a semantic representation
  ○ Performing the actual analysis with symbolic execution
  ○ For more information: https://www.cs.umd.edu/~mwh/se-tutorial/symbolicexec.pdf
● Finding Commands with Angr
  ○ We prepared a symbolic executor and a solver for you
    ■ Your job is to find the starting point of the function which interprets the command, and find the end point where the malware actually executes some function that does malicious operations
      ● Use a Control-flow Graph (CFG) analysis tool!
    ■ The symbolic executor is called angr (http://angr.io/index.html)
  ○ How do you run it?
    ■ Go to ~/tools/sym-exec
    ■ Run it like: python ./sym_exec.py [program_path] [start_address] [end_address]
    ■ Replace the start and end addresses above with the ones from your CFG, for example: python ./sym_exec.py ~/shared/stage1.exe 4050c0 40518a
    ■ The command will be printed at the end (if found)

Angr Tutorial
● SimState
  ○ While angr performs symbolic execution, it stores the current state of the program in SimState objects.
  ○ SimState is a structure that contains the program’s memory, registers, and other information.
  ○ SimState provides interaction with memory and registers. For example, state.regs offers read and write access by register name, such as state.regs.eip, state.regs.rbx, state.regs.ebx, state.regs.bh
  ○ Creating an empty 64 bit SimState
● Bitvectors
  ○ Since we are dealing with binary files, we don’t deal with regular integers.
  ○ In a binary program, everything becomes bits and sequences of bits.
  ○ A bitvector is a sequence of bits used to perform integer arithmetic for symbolic execution.
  ○ Creating some 32 bit bitvector values
  ○ state.solver.BVV(4, 32) will create a 32-bit bitvector with value 4
  ○ We can perform arithmetic operations or comparisons using the bitvectors
● Symbolic Bitvectors
  ○ state.solver.BVS('x', 32) will create a 32-bit symbolic variable named x
  ○ Angr allows us to perform arithmetic operations or comparisons using them.
● Registers
  ○ State provides access to the registers through state.regs.register_name, where register_name could be rcx, ecx, cx, ch, or cl. The same applies to the other registers.
  ○ Look at the types of the registers: they are bitvectors
  ○ Look at the lengths of the registers examined below. They are all symbolic bitvectors because they are not initialized yet.
  ○ cl, ch, cx, and ecx are all part of rcx.
  ○ You can compare the length and the location of cl, ch, cx, ecx, and rcx in angr with the actual architecture depicted below.
● Constraints
  ○ In a CFG, a line like if ( x > 10 ) creates a branch.
Please look at the Symbolic Execution Concepts tutorial.
  ○ Assuming x is a symbolic variable, this creates the constraint x > 10 on the successor state when the True branch is taken.
  ○ For the False branch, the negation of that constraint (x <= 10) is created.
  ○ Adding a constraint to a SimState
    ■ The cl register equals 11
    ■ state.add_constraints(state.regs.cl == 11)
    ■ state.add_constraints(state.regs.cl == state.solver.BVV(0xb, 8)), since state.solver.BVV(0xb, 8) equals 11
    ■ You can see that their effect on the SimState is the same in the example below.

Radare2 Tutorial
● Start
  ○ Launch radare2 with $ r2 ~/shared/payload.exe
  ○ Then type aaa, which will analyze all (functions + bbs)
  ○ afl lists all functions
  ○ The full afl output is long and hard to read through, so filter it.
  ○ afl~name greps the list of functions for the given name
  ○ afl~attack will list all the functions whose names contain attack
  ○ You can use Linux commands such as grep while inside the r2 console.
  ○ On the right side, you can see all the functions having the attack vector (afl~send)
  ○ Using those API calls, this Linux malware performs DDoS attacks based on the commands it receives from the C&C server.
  ○ The example below shows how to find all the attack vectors calling sym.send/sym.sendto
  ○ Now, we have to iterate over all the attack functions on the right. For example, the example below shows three attack functions, and only one of them is called. Our focus is the functions referenced by a call sym.attack_????? instruction.
  ○ Let’s analyze the example below.
  ○ axt sym.attack_app_http shows only one reference, which is a push instruction. This is not the attack function we are interested in.
  ○ axt sym.attack_app_cfnull shows no reference at all. This is not the attack function we need to explore.
  ○ axt sym.attack_???? is one of the functions listed in the example on the right, and it is referenced by a call sym.attack_????? instruction. That is the function we need to explore further to determine the target address for the symbolic execution.
  ○ You need to find 2 attack functions.
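The reference check described above (keep an attack function only if some instruction actually calls it) can be sketched in Python. This is a minimal sketch, not part of the provided tools: the helper name call_references, the sample text, and the function names in it are hypothetical, and the assumed line layout (function, address, bracketed type, instruction) may differ between radare2 versions.

```python
import re

# Assumed layout of one axt output line:
#   "<function> <address> [<type>] <instruction>"
# radare2 versions may format this differently; adjust the pattern if needed.
AXT_LINE = re.compile(r"^(\S+)\s+(0x[0-9a-fA-F]+)\s+\[[^\]]*\]\s+(.+)$")


def call_references(axt_output):
    """Return (function, address) pairs for references made by a call instruction."""
    refs = []
    for line in axt_output.splitlines():
        match = AXT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        function, address, instruction = match.groups()
        if instruction.split()[0] == "call":
            refs.append((function, address))
    return refs


# Hypothetical sample text for illustration only (not real output from payload.exe).
SAMPLE = """
sym.setup_table 0x0804a010 [DATA] push sym.attack_example
sym.dispatch 0x0804b120 [CALL] call sym.attack_example
"""

print(call_references(SAMPLE))
```

An empty result for a function corresponds to the "push only" and "no reference" cases above, so that function can be skipped.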
● After finding the attack function, we can determine the target address.
  ○ First, seek to the function using s sym.attack_????.
  ○ Second, run pdf | grep sym.send or pdf | grep sym.sendto to determine the instruction address
  ○ Third, run s address_for_call_sym.send(to) to point to the call sym.send or call sym.sendto instruction
  ○ Lastly, print 2 instructions starting with the call sym.send/sym.sendto instruction (for example, with pd 2)
  ○ The address of the instruction that follows call sym.send(to) is the target address for the symbolic execution.
● For more information:
  ○ https://github.com/radare/radare2
  ○ https://www.radare.org/get/THC2018.pdf

Android Cheatsheet

Start Emulator
~/bin/run-emulator

Add Contact
The sleeps are needed to give a slow emulator time to process.
adb shell "am start -a android.intent.action.INSERT -t vnd.android.cursor.dir/contact -e name 'GatechID'"
sleep 1
adb shell input keyevent 4
sleep 1
adb shell input keyevent 4

Android Log
adb logcat

Filtered Log
The adb tool has no way to filter by app; fortunately, there’s a script that’ll do just that.
Get the script and make it executable (review it before running something off the internet):
wget -O ~/bin/pidcat.py https://raw.githubusercontent.com/JakeWharton/pidcat/master/pidcat.py
chmod +x ~/bin/pidcat.py
~/bin/pidcat.py com.smsmessenger

Decompile APK
Note: Omitting the !@#$% option allows it to decode the resources as well as the smali code.
apktool decode ~/Android/MaliciousMessenger/sms.apk --output ~/Android/MaliciousMessenger/sms

Build Modified APK
apktool build ~/Android/MaliciousMessenger/sms --output ~/Android/MaliciousMessenger/sms_modded.apk

Sign Modified APK
~/bin/signer.py ~/Android/MaliciousMessenger/sms_modded.apk

Uninstall APK
adb uninstall com.smsmessenger

Install Modified APK
adb install ~/Android/MaliciousMessenger/sms_modded.apk

Launch the App
The app will not be active until you run it at least once after re-installation (I spent a bunch of time banging my head against the wall until I figured this one out).
adb shell monkey -p com.smsmessenger -c android.intent.category.LAUNCHER 1

Send an SMS
Use single quotes or you’ll need to escape the message contents. Note: I didn’t test with emojis!
adb emu sms send 8675309 'Jenny Ive called your number…'
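
The Build, Sign, Uninstall, Install, and Launch steps above tend to be repeated after every smali edit, so it can help to collect them in one place. The following is a minimal sketch under stated assumptions: the helper name redeploy_commands is hypothetical, the paths mirror the ones used in this cheatsheet, and the script only prints the commands (a dry run) rather than executing them.

```python
import shlex
from pathlib import Path

PACKAGE = "com.smsmessenger"  # package name used in the cheatsheet commands


def redeploy_commands(workdir, signer):
    """Build the build/sign/uninstall/install/launch commands as argv lists."""
    src = str(Path(workdir) / "sms")             # decoded APK directory
    apk = str(Path(workdir) / "sms_modded.apk")  # rebuilt APK
    return [
        ["apktool", "build", src, "--output", apk],
        [str(signer), apk],
        ["adb", "uninstall", PACKAGE],
        ["adb", "install", apk],
        ["adb", "shell", "monkey", "-p", PACKAGE,
         "-c", "android.intent.category.LAUNCHER", "1"],
    ]


if __name__ == "__main__":
    home = Path.home()
    commands = redeploy_commands(home / "Android" / "MaliciousMessenger",
                                 home / "bin" / "signer.py")
    for argv in commands:
        # Dry run: print each command; nothing is executed here.
        print(shlex.join(argv))
```

To actually run the sequence, each argv list could be passed to subprocess.run(argv, check=True), which stops at the first failing step.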