Assignment Chef

Assignment catalog

33,401 assignments available

[SOLVED] CSED211 – 1 Introduction

The purpose of this assignment is to become more familiar with bit-level representations of integers. You'll solve five problems; please refer to the presentation for details.

2 Logistics

This is an individual project. All handins are electronic. Clarifications and corrections will be posted on the course Web page.

3 Handout Instructions

Start by copying datalab.tar to a (protected) directory on a Linux machine in which you plan to do your work. Then give the command

unix> tar xvf datalab.tar

This will cause a number of files to be unpacked in the directory. The only file you will be modifying and turning in is bits.c. The bits.c file contains a skeleton for each of the 5 programming problems. Your assignment is to complete each function skeleton using only straight-line code for the integer problems (i.e., no loops or conditionals) and a limited number of C arithmetic and logical operators. Specifically, you are only allowed to use the following eight operators:

! ~ & ^ | + << >>

A few of the functions further restrict this list. Also, you are not allowed to use any constants longer than 8 bits. See the comments in bits.c for detailed rules and a discussion of the desired coding style.

4 The Problems

This section describes the problems that you will be solving in bits.c. There are 5 problems: bitNor, isZero, addOK, logicalShift, and absVal. Using only the legal operations we allow you, implement each function so it exhibits the desired behavior. Please refer to the presentation for detailed instructions.

Name               Description                                         Rating
bitNor(x,y)        ~(x | y) using only ~ and &                         1
isZero(x)          return 0 if x is non-zero, else 1                   1
addOK(x,y)         determine if x+y can be computed without overflow   3
absVal(x)          absolute value of x                                 4
logicalShift(x,n)  shift right logical                                 3

Table 1: Bit-Level Manipulation Functions.

5 Evaluation

Your score will be computed out of a maximum of 12 points.

Correctness points. We will evaluate the functional correctness of your functions. You will get full credit for a problem if it passes all of the tests, and no credit otherwise.

Autograding your work. We have included some autograding tools in the handout directory — btest, dlc, and driver.pl — to help you check the correctness of your work.

• btest: This program checks the functional correctness of the functions in bits.c. To build and use it, type the following two commands:

unix> make
unix> ./btest

Notice that you must rebuild btest each time you modify your bits.c file. You'll find it helpful to work through the functions one at a time, testing each one as you go. You can use the -f flag to instruct btest to test only a single function:

unix> ./btest -f bitNor

6 Handin Instructions

Upload your source file bits.c and your report to PLMS. You need to explain your answers in the report. The file-name format is (student number)_(your name).c / .pdf.

7 Advice

• The dlc program enforces a stricter form of C declarations than is the case for C++ or than is enforced by gcc. In particular, any declaration must appear in a block (what you enclose in curly braces) before any statement that is not a declaration. For example, it will complain about the following code:

int foo(int x)
{
  int a = x;
  a *= 3;     /* Statement that is not a declaration */
  int b = a;  /* ERROR: Declaration not allowed here */
}
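For the trickier puzzles it helps to have a reference for the intended 32-bit semantics. A minimal Python sketch (hypothetical helper names; Python integers are unbounded, so values are masked to 32 bits first):

def bit_nor_ref(x, y):
    # ~(x | y) equals ~x & ~y by De Morgan's law; mask to 32 bits.
    return ~(x | y) & 0xFFFFFFFF

def logical_shift_ref(x, n):
    # Logical right shift: the vacated high bits are filled with zeros.
    return (x & 0xFFFFFFFF) >> n

assert bit_nor_ref(0, 0) == 0xFFFFFFFF
assert logical_shift_ref(-1, 4) == 0x0FFFFFFF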

$25.00

[SOLVED] CSCI544: Homework Assignment №4

Introduction

This assignment gives you hands-on experience building deep learning models for named entity recognition (NER). We will use the CoNLL-2003 corpus to build a neural network for NER. As in HW2, the folder named data contains three files: train, dev and test. The train and dev files provide sentences with human-annotated NER tags; the test file provides only the raw sentences. In the data format, each line contains three items separated by a white space: the index of the word in the sentence, the word type, and the corresponding NER tag. A blank line follows the end of each sentence. We also provide a file named glove.6B.100d.gz, which contains the GloVe word embeddings [1], and the official evaluation script conll03eval to evaluate the results of the model. To use the script, you need to install Perl and prepare your prediction file in the following format:

idx word gold pred

where there is a white space between two columns, gold is the gold-standard NER tag, and pred is the model-predicted tag. Then execute the command line:

perl conll03eval < {predicted file}

where {predicted file} is the prediction file in the prepared format.

Task 1: Simple Bidirectional LSTM model (40 points)

Task: implement a bidirectional LSTM network with PyTorch. The architecture of the network is:

Embedding → BLSTM → Linear → ELU → classifier

The hyper-parameters of the network are listed in the following table:

embedding dim          100
number of LSTM layers  1
LSTM hidden dim        256
LSTM dropout           0.33
linear output dim      128

Train this simple BLSTM model on the NER training data with SGD as the optimizer. Please tune the parameters that are not specified in the above table, such as batch size, learning rate and learning rate scheduling. What are the precision, recall and F1 score on the dev data? (Hint: a reasonable F1 score on dev is 77%.)

Task 2: Using GloVe word embeddings (60 points)

The second task is to use the GloVe word embeddings to improve the BLSTM in Task 1. The way we use the GloVe word embeddings is straightforward: we initialize the embeddings in our neural network with the corresponding vectors in GloVe. Note that GloVe is case-insensitive, but our NER model should be case-sensitive, because capitalization is important information for NER. You are asked to find a way to deal with this conflict. What are the precision, recall and F1 score on the dev data? (Hint: a reasonable F1 score on dev is 88%.)

Bonus: LSTM-CNN model (10 points)

Submission

Please follow the instructions and submit a zipped folder containing:

1. A model file named blstm1.pt for the trained model in Task 1.
2. A model file named blstm2.pt for the trained model in Task 2.
3. Predictions on both dev and test data from Task 1 and Task 2, named dev1.out, dev2.out, test1.out and test2.out, respectively. All these files should be in the same format as the training data.
4. Your Python code and a README file that describes how to run your code to produce your prediction files. In the README file, provide the command lines that produce the prediction files. (We will execute your commands to reproduce your reported results on dev.)
5. A PDF file which contains answers to the questions in the assignment, along with a clear description of your solution, including all the hyper-parameters used in the network architecture and model training.
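For Task 1, a minimal PyTorch sketch of the prescribed architecture (vocab_size and num_tags are placeholders to be derived from the training data; since the LSTM has a single layer, the table's dropout is applied to the LSTM output here, which is one reasonable interpretation):

import torch
import torch.nn as nn

class BLSTM(nn.Module):
    def __init__(self, vocab_size, num_tags):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, 100)          # embedding dim 100
        self.lstm = nn.LSTM(100, 256, num_layers=1,       # hidden dim 256
                            bidirectional=True, batch_first=True)
        self.dropout = nn.Dropout(0.33)
        self.linear = nn.Linear(2 * 256, 128)             # linear output dim 128
        self.elu = nn.ELU()
        self.classifier = nn.Linear(128, num_tags)

    def forward(self, x):                                 # x: (batch, seq_len)
        out, _ = self.lstm(self.emb(x))                   # (batch, seq_len, 512)
        return self.classifier(self.elu(self.linear(self.dropout(out))))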
References

[1] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.

$25.00

[SOLVED] CSCI572 – Homework: Web Crawling

1. Objective

In this assignment, you will work with a simple web crawler to measure aspects of a crawl, study the characteristics of the crawl, download web pages from the crawl and gather webpage metadata, all from pre-selected news websites.

2. Preliminaries

To begin, we will make use of an existing open-source Java web crawler called crawler4j, which is hosted on GitHub. For complete details on downloading and compiling, see the crawler4j repository. Also see the document "Instructions for Installing Eclipse and Crawler4j" located on the Assignments web page for help. Note: you can use any IDE of your choice, but we have provided installation instructions for the Eclipse IDE only.

3. Crawling

The maximum pages to fetch can be set in crawler4j and should be set to 20,000 to ensure a reasonable execution time for this exercise. Also, maximum depth should be set to 16 to ensure that we limit the crawling.

USC ID ends with  News Site to Crawl    NewsSite Name  Root URL
01~20             NY Times              nytimes        https://www.nytimes.com
21~40             Wall Street Journal   wsj            https://www.wsj.com
41~60             Fox News              foxnews        https://www.foxnews.com
61~80             USA Today             usatoday       https://www.usatoday.com
81~00             Los Angeles Times     latimes        https://www.latimes.com

Limit your crawler so it only visits HTML, doc, pdf and different image format URLs, and record the metadata for those file types.

4. Collecting Statistics

Your primary task is to enhance the crawler so it collects information about:

1. the URLs it attempts to fetch: a two-column spreadsheet, column 1 containing the URL and column 2 containing the HTTP/HTTPS status code received; name the file fetch_NewsSite.csv (where "NewsSite" is replaced by the news website name in the table above that you are crawling). The number of rows should be no more than 20,000, as that is our pre-set limit. Column names for this file can be URL and Status.

2. the files it successfully downloads: a four-column spreadsheet, column 1 containing the URLs successfully downloaded, column 2 containing the size of the downloaded file (in bytes, or your own preferred unit: bytes, KB, MB), column 3 containing the number of outlinks found, and column 4 containing the resulting content-type; name the file visit_NewsSite.csv; clearly the number of rows will be less than the number of rows in fetch_NewsSite.csv.

3. all of the URLs (including repeats) that were discovered and processed in some way: a two-column spreadsheet where column 1 contains the encountered URL and column 2 an indicator of whether the URL (a) resides in the website (OK), or (b) points outside of the website (N_OK). (A URL points out of the website if it does not start with the initial host/domain name; e.g., when crawling the USA Today news website, all inside URLs must start with the root URL https://www.usatoday.com.) Name the file urls_NewsSite.csv. This file will be much larger than fetch_*.csv and visit_*.csv. For example, for the New York Times, URLs under the root domain are considered as residing in the same website, whereas the following URL is not: http://store.nytimes.com/

Note 1: you should modify the crawler so it outputs the above data into three separate csv files; you will use them for processing later.
Note 2: all uses of NewsSite above should be replaced by the name given in the column labeled NewsSite Name in the table above.
Note 3: you should denote the units in the size column of visit.csv.
The best way is to write the units you are using in the column header name and keep the rest of the size data numeric for easier statistical analysis. The hard requirement is only to show the units clearly and correctly.

Based on the information recorded by the crawler in the output files above, you are to collate the following statistics for a crawl of your designated news website:

● Fetch statistics:
  o # fetches attempted: the total number of URLs that the crawler attempted to fetch. This is usually equal to the MAXPAGES setting if the crawler reached that limit; less if the website is smaller than that.
  o # fetches succeeded: the number of URLs that were successfully downloaded in their entirety, i.e. returning a HTTP status code of 2XX.
  o # fetches failed or aborted: the number of fetches that failed for whatever reason, including, but not limited to: HTTP redirections (3XX), client errors (4XX), server errors (5XX) and other network-related errors.
● Outgoing URLs: statistics about URLs extracted from visited HTML pages
  o Total URLs extracted: the grand total number of URLs extracted (including repeats) from all visited pages
  o # unique URLs extracted: the number of unique URLs encountered by the crawler
  o # unique URLs within your news website: the number of unique URLs encountered that are associated with the news website, i.e. the URL begins with the given root URL of the news website, but the remainder of the URL is distinct
  o # unique URLs outside the news website: the number of unique URLs encountered that were not from the news website
● Status codes: number of times various HTTP status codes were encountered during crawling, including (but not limited to): 200, 301, 401, 402, 404, etc.
● File sizes: statistics about file sizes of visited URLs – the number of files in each size range (see Appendix A). 1KB = 1024B; 1MB = 1024KB.
● Content type: a list of the different content-types encountered

These statistics should be collated and submitted as a plain text file whose name is CrawlReport_NewsSite.txt, following the format given in Appendix A at the end of this document. Make sure you understand the crawler code and required output before you commence collating these statistics.

For efficient crawling it is a good idea to have multiple crawling threads, and you are required to use multiple threads in this exercise. crawler4j supports multi-threading, and our examples set the number of crawlers to seven (see the line in the code int numberOfCrawlers = 7;). However, with a naive implementation the threads will trample on each other when outputting to your statistics collection files, so you need to be smarter about how to collect the statistics. The crawler4j documentation has a good example of how to do this:
https://github.com/yasserg/crawler4j/blob/master/crawler4j-examples/crawler4j-examples-base/src/test/java/edu/uci/ics/crawler4j/examples/localdata/LocalDataCollectorCrawler.java

All the information that you are required to collect can be derived by processing the crawler output.

5. FAQ

Q: For the purposes of counting unique URLs, how should we handle URLs that differ only in the query string? For example: https://www.nytimes.com/page?q=0 and https://www.nytimes.com/page?q=1
A: These can be treated as different URLs.

Q: URL case sensitivity: are these the same, or different URLs?
https://www.nytimes.com/foo and https://www.nytimes.com/FOO
A: The path component of a URL is considered to be case-sensitive, so per RFC 3986 these are different URLs and the crawler behavior is correct. They may still return the same content if either:
● that particular web server implementation treats the path as case-insensitive (some server implementations do this, especially Windows-based implementations), or
● the web server implementation treats the path as case-sensitive, but aliasing or a redirect is being used.
This is one of the reasons why deduplication is necessary in practice.

Q: Attempting to compile the crawler results in syntax errors.
A: Make sure that you have included crawler4j as well as all its dependencies. Also check your Java version; the code includes more recent Java constructs such as the typed collection List<T>, which requires at least Java 1.5.0.

Q: I get the following warnings when trying to run the crawler:
log4j: WARN No appenders could be found for logger
log4j: WARN Please initialize the log4j system properly.
A: You failed to include the log4j.properties file that comes with crawler4j.

Q: On Windows, I am encountering the error: Exception_Access_Violation
A: This is a Java issue.

Q: I am encountering multiple instances of this info message:
INFO [Crawler 1] I/O exception (org.apache.http.NoHttpResponseException) caught when processing request: The target server failed to respond
INFO [Crawler 1] Retrying request
A: As indicated by the info message, the crawler will retry the fetch, so a few isolated occurrences of this message are not an issue. However, if the problem repeats persistently, the situation is not likely to improve if you continue hammering the server at the same frequency. Try giving the server more room to breathe:

/*
 * Be polite: make sure that we don't send more than
 * one request per second (1000 milliseconds between requests).
 */
config.setPolitenessDelay(2500);
/*
 * Read the website's robots.txt – Crawl-Delay: 10
 * Multiply that value by 1000 for the millisecond value.
 */

Q: The crawler seems to choke on some of the downloaded files, for example:
java.lang.StringIndexOutOfBoundsException: String index out of range: -2
java.lang.NullPointerException: charsetName
A: Safely ignore those. We are using a fairly simple, rudimentary crawler, and it is not necessarily robust enough to handle all the possible quirks of heavy-duty crawling and parsing. These problems are few in number (compared to the entire crawl size), and for this exercise we're okay with it as long as the crawler skips the few problem cases, keeps crawling everything else, and terminates properly – as opposed to exiting with fatal errors.

Q: I see these messages:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
A: Adding the SLF4J binding as an external JAR to the project, in the same way as the crawler4j JAR, will make the crawler display logs.

Q: What should we do if a URL contains a comma?
A: Replace the comma with "-" or "_", so that it doesn't throw an error.

Q: Does the number of 200 codes in the fetch.csv file have to exactly match the number of records in the visit.csv?

Q: "CrawlConfig cannot be resolved to a type"?
A: import edu.uci.ics.crawler4j.crawler.CrawlConfig;

Q: What's the difference between aborted fetches and failed fetches?

Q: For some reason my crawler attempts 19,999 fetches even though max pages is set to 20,000; does this matter?

Q: How do we differentiate fetched pages and downloaded pages?
A: In this assignment we do not ask you to save any of the downloaded files to disk. Visiting a page means crawler4j processes the page (it parses it and extracts relevant information such as outgoing URLs), so all visited pages are downloaded. You must make sure that your crawler crawls both the http and https pages of the given domain.

Q: How much time should it approximately take to crawl a website using n crawlers?
A: (i) It depends on the parameters you set for the crawler, and (ii) it depends on the politeness delay you set in the crawler program. Therefore, it can vary for everyone.

Q: For the third CSV file, urls_NewsSite.csv, should the discovered URLs include redirect URLs?
A: Yes. If the redirect URL is the one that gets the 3XX status code, then the URL it redirects to will be added to the scheduler of the crawler and waits to be visited.

Q: When the URL ends with "/", what needs to be done?
A: You should filter using content type. Have a peek at the crawler4j code; it gives a hint on how to determine the content type of a page even if the extension is not explicitly mentioned in the URL.

Q: Eclipse keeps crashing after a few minutes of running my code, but when I reduce the number of pages to fetch, it works fine.
A: Increase the heap size for Eclipse.

Q: What if a URL has an unknown extension?
A: Check the content type of the page if it has an unknown extension.

Q: Why do some links return true in shouldVisit() but cannot be visited by visit()?

Comment: the linked crawler4j example has details on the regular expressions that you need to take care of.
Comment: file types css, js, json and others should not be visited; e.g., you can add .json to your pattern filter. If the extension does not appear, use !page.getContentType().contains("application/json").

If # fetches attempted = # fetches succeeded + # fetches aborted + # fetches failed, your homework is OK. However, the variation should not be more than 10% away from the limit, as that is an indication that something is wrong.
Scenario: My visit.csv file has about 15 fewer URLs than the number of URLs with status code 200. It is fine if the difference is less than 10%.
aborted: the client (the crawler) decided to stop the fetching (e.g., it was taking too much time to fetch).

Q: In visit_NewsSite.csv, do we also need to chop "charset=utf-8" from the content-type, or just chop it in the report?
A: You can chop the encoding part (charset=utf-8) in all places.

Q: Regarding statistics
A: # unique URLs extracted = # unique URLs within + # unique URLs outside. # total URLs extracted is the sum of the outgoing-link counts, i.e. the sum of all values in column 3 of visit.csv.

Q: How do we handle pages with no extension?
A: Use getContentType() in visit() and don't rely just on the extension. If the content type returned is not one of the required content types for the assignment, you should ignore it for any calculation of the statistics. This will probably result in more rows in visit.csv, but it's acceptable according to the grading guidelines.

Note 1: Extracted URLs do not have to be added to the visit queue. Those which satisfy a requirement (e.g. content type, domain, not duplicate) will be added to the visit queue; others will be dumped by the crawler. However, as long as the grading guideline is satisfied, we will not deduct points.
Note 2: 303 could be considered aborted; 404 could be considered failed.
Note 3: Fetch statistics: # fetches attempted: the total number of URLs that the crawler attempted to fetch.
This is usually equal to the MAXPAGES setting if the crawler reached that limit; less if the website is smaller than that. # fetches succeeded: the number of URLs that were successfully downloaded in their entirety, i.e. returning a HTTP status code of 2XX. # fetches failed or aborted: the number of fetches that failed for whatever reason, including, but not limited to: HTTP redirections (3XX), client errors (4XX), server errors (5XX) and other network-related errors.
Note 4: Consider failed and aborted fetches as the same, as in Note 3.
Note 5: Hint on crawling pages other than HTML: look for how to turn ON binary content crawling in crawler4j. Make sure you are not crawling only the parsed HTML data but also the binary data, which includes file types other than HTML. Search on the internet for how to crawl binary data and parse pages other than HTML. There will be pages other than HTML on almost every news site, so please make sure you crawl them properly.

Q: Regarding the content type in visit_NewsSite.csv, should we display "text/html;charset=UTF-8" or chop out the encoding and write "text/html"?
A: Only text/html; ignore the rest.

Q: Should we limit the URLs that the crawler attempts to fetch to the news domain? E.g., if we encounter an outside URL, should we skip fetching it by adding constraints in shouldVisit()? And do we need to include it in urls_NewsSite.csv?
A: Yes, you need to include every encountered URL in urls_NewsSite.csv.

Q: Should all 3XX, 4XX, 5XX responses be considered aborted?
A: Yes.

Q: Are "cookie" domains considered part of the original news site domain?
A: No, they should not be included as part of the news site you are crawling. For details see https://web.archive.org/web/20200418163316/https://www.mxsasha.eu/blog/2014/03/04/definitive-guide-to-cookie-domains/

Q: More about statistics
A: visit.csv will contain the URLs which succeeded, i.e. 200 status codes with known/allowed content types. fetch.csv will include all the URLs that were attempted, i.e. with all the status codes: fetch.csv entries = visit.csv entries (with 2XX status codes) + entries with status codes other than 2XX. (Note: fetch.csv should have URLs from the news site domain only.)

Q: Do we need to check the content-type for all the extracted URLs, i.e. urls.csv, or just for visited URLs, e.g. those in visit.csv?
A: Only those in visit_NewsSite.csv.

Q: How do we get the size of the downloaded file?
A: It will be the size of the page, e.g. for an image or pdf, the size of the image or the pdf; for HTML files, the size of the file. The size should be in bytes (or KB, MB, etc.): page.getContentData().length.

Q: How do we change the logging level in crawler4j?
A: If you are using the latest version of crawler4j, logging can be controlled through logback.xml. See the GitHub issue thread on logback configuration for more.

Q: Crawling URLs only yields text/html. I have only filtered out css|js|mp3|zip|gz, but all the visited URLs have return type text/html. Is that fine, or is there a problem?
A: It is fine.
Some websites host their asset files (images/pdfs) on another CDN, and the URL for those would be different from www.newssite.com, so you might only get HTML files for that news site.

Q: Eclipse error: Provider class org.apache.tika.parser.external.CompositeExternalParser not in module. I'm trying to follow the guide and run the boilerplate code, but Eclipse gives this error when I run the copy-pasted code from the installation guide.
A: Please import the crawler4j JARs in ClassPath and not ModulePath while configuring the build in Eclipse.

Q: /data/crawl error:
Exception in thread "main" java.lang.Exception: couldn't create the storage folder: /data/crawl does it already exist ?
at edu.uci.ics.crawler4j.crawler.CrawlController.<init>(CrawlController.java:84)
at Controller.main(Controller.java:20)
A: Replace the path /data/crawl in the Controller class code with a location on your machine.

Q: Do we need to remove duplicate URLs in fetch.csv (if they exist)?
A: crawler4j already handles duplication checks, so you don't have to handle it. It doesn't crawl pages that have already been visited.

Q: Error in Controller.java: "Unhandled exception type Exception"
A: Make sure exception handling is taken care of in the code. Since the CrawlController class throws an exception, it needs to be handled inside a try-catch block.

Q: The crawler cannot stop – when I set maxFetchPage to 20,000, my script keeps running forever and I have to kill it myself, even though it looks like the crawler has crawled all 20,000 pages.
A: Set a reasonable maxDepthOfCrawling, politeness delay, setSocketTimeout(), and number of crawlers in the Controller class, and retry. Also ensure there are no System.out.print() statements running inside the crawler code.

6. Submission Instructions

USC ID ends with  Report File
01~20             CrawlReport_nytimes.txt
21~40             CrawlReport_wsj.txt
41~60             CrawlReport_foxnews.txt
61~80             CrawlReport_usatoday.txt
81~00             CrawlReport_latimes.txt

● Also include the output files generated from your crawler run, named as shown above:
  o fetch_NewsSite.csv
  o visit_NewsSite.csv
● Do NOT include the output file urls_NewsSite.csv,
  where _NewsSite should be replaced by the name from the table above.
● Do not submit Java code or compiled programs; it is not required.
● Compress all of the above into a single zip archive named crawl.zip. Use only standard zip format; do NOT use other formats such as zipx, rar, ace, etc. For example, the zip file might contain the following three files:
  1. CrawlReport_nytimes.txt (the statistics file)
  2. fetch_nytimes.csv
  3. visit_nytimes.csv
● Please upload your homework to your Google Drive CSCI572 folder, in the subfolder named hw2.

Appendix A

Use the following format to tabulate the statistics that you collated based on the crawler outputs.

CrawlReport_NewsSite.txt

Name: Tommy Trojan
USC ID: 1234567890
News site crawled: nytimes.com
Number of threads: 7

Fetch Statistics
================
# fetches attempted:
# fetches succeeded:
# fetches failed or aborted:

Outgoing URLs:
==============
Total URLs extracted:
# unique URLs extracted:
# unique URLs within News Site:
# unique URLs outside News Site:

Status Codes:
=============
200 OK:
301 Moved Permanently:
401 Unauthorized:
403 Forbidden:
404 Not Found:

File Sizes:
===========
< 1KB:
1KB ~
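Since all of the required report statistics can be derived by processing the crawler's CSV output (as noted above), here is a minimal post-processing sketch in Python; the file name and the one-row-per-URL layout are assumptions matching the fetch_NewsSite.csv description:

import csv
from collections import Counter

status_counts = Counter()
with open("fetch_nytimes.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)                      # skip the URL,Status header row
    for url, status in reader:
        status_counts[int(status)] += 1

attempted = sum(status_counts.values())
succeeded = sum(n for code, n in status_counts.items() if 200 <= code < 300)
print("# fetches attempted:", attempted)
print("# fetches succeeded:", succeeded)
print("# fetches failed or aborted:", attempted - succeeded)
for code, n in sorted(status_counts.items()):   # per-status-code counts
    print(f"{code}: {n}")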

$25.00

[SOLVED] CSCI544: Homework Assignment №3

This assignment is an extension of HW assignment 1. Please follow the instructions and submit a zipped folder containing:

1. A PDF containing a Jupyter Notebook response to the assignment with sufficient comments and explanations. Your Jupyter Notebook should contain both code and text cells with enough commentary that the reader can understand your solution as well as your responses to the questions. In the Jupyter notebook, please print the requested values, too. If it is more convenient, you can also submit a PDF similar to assignment 1, i.e., initially explaining your solution and then merging in a Jupyter notebook.
2. Your Jupyter Notebook, submitted separately in .ipynb format such that it can be easily executed. Please include the versions of the required dependencies. You can assume that data.tsv is a raw dataset in the current directory that your notebook should read before performing all the required steps and generating the desired outputs.

You can use PyTorch or TensorFlow to implement the neural network models in this assignment. Please name your zipped file "HW3YourFirstName-YourLastName-L.zip", where L should be either "PT" or "TF", denoting whether you used PyTorch or TensorFlow. You can also use publicly available implementations in portions of your solution, but you need to include proper references to your resources, e.g., the URL of the page that you used as a reference, books, etc.

1. Dataset Generation

We will use the Amazon reviews dataset used in HW1. Load the dataset and build a balanced dataset of 60K reviews along with their ratings to create labels through random selection, similar to HW1. You can store your dataset after generation and reuse it to reduce the computational load. For your experiments, consider an 80%/20% training/testing split.

2. Word Embedding (25 points)

In this part of the assignment, you will generate Word2Vec features for the dataset you generated. You can use the Gensim library for this purpose. A helpful tutorial is available at the following link:
https://radimrehurek.com/gensim/auto_examples/tutorials/run_word2vec.html

(a) (5 points) Load the pretrained "word2vec-google-news-300" Word2Vec model and learn how to extract word embeddings for your dataset. Check the semantic similarities of the generated vectors using three examples of your own, e.g., King − Man + Woman = Queen or excellent ∼ outstanding.

(b) (20 points) Train a Word2Vec model using your own dataset. You will use these extracted features in the subsequent questions of this assignment. Set the embedding size to 300 and the window size to 13. You can also consider a minimum word count of 9. Check the semantic similarities for the same examples as in part (a). What do you conclude from comparing the vectors generated by yourself with those of the pretrained model? Which of the Word2Vec models seems to encode semantic similarities between words better?

For the rest of this assignment, use the pretrained "word2vec-google-news-300" Word2Vec features.
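For part 2(a), a minimal Gensim sketch of loading the pretrained model and probing the example similarities (the model is downloaded on first use and is roughly 1.6 GB):

import gensim.downloader as api

wv = api.load("word2vec-google-news-300")

# King - Man + Woman should land near Queen.
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# excellent ~ outstanding: cosine similarity of the two word vectors.
print(wv.similarity("excellent", "outstanding"))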
3. Simple Models (20 points)

Using the Google pretrained Word2Vec features, train a single perceptron and an SVM model for the classification problem. For this purpose, use the average Word2Vec vector for each review as the input feature, i.e. x = (w_1 + w_2 + … + w_N)/N for a review with N word vectors w_i. Report your accuracy values on the testing split for these models similar to HW1, i.e., for each of the perceptron and SVM models, report two accuracy values, one with Word2Vec and one with TF-IDF features. What do you conclude from comparing performances for the models trained using the two different feature types (TF-IDF and your trained Word2Vec features)?

4. Feedforward Neural Networks (25 points)

Using the Word2Vec features, train a feedforward multilayer perceptron network for classification. Consider a network with two hidden layers, with 100 and 10 nodes, respectively. You can use cross-entropy loss and your own choice for other hyperparameters, e.g., nonlinearity, number of epochs, etc. Part of getting good results is to select suitable values for these hyperparameters. You can also refer to a tutorial to familiarize yourself; although typical tutorials use image data, the concept of training an MLP is very similar to what we want to do.

(a) (10 points) To generate the input features, use the average Word2Vec vectors as in the "Simple Models" section and train the neural network. Report accuracy values on the testing split for your MLP.

(b) (15 points) To generate the input features, concatenate the first 10 Word2Vec vectors for each review as the input feature (x = [w_1, w_2, …, w_10]) and train the neural network. Report the accuracy value on the testing split for your MLP model. What do you conclude by comparing the accuracy values you obtain with those obtained in the "Simple Models" section?

5. Recurrent Neural Networks (30 points)

Using the Word2Vec features, train a recurrent neural network (RNN) for classification. You can refer to the following tutorial to familiarize yourself:
https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

(a) (10 points) Train a simple RNN for sentiment analysis. You can consider an RNN cell with a hidden state size of 20. To feed your data into the RNN, limit the maximum review length to 20 by truncating longer reviews and padding shorter reviews with a null value (0). Report accuracy values on the testing split for your RNN model. What do you conclude by comparing the accuracy values you obtain with those obtained with the feedforward neural network models?

(b) (10 points) Repeat part (a) with a gated recurrent unit (GRU) cell.

(c) (10 points) Repeat part (a) with an LSTM cell. What do you conclude by comparing the accuracy values you obtain with GRU, LSTM, and simple RNN cells?

Note: In total, you need to report accuracy values for 2 (Perceptron + SVM) + 2 (FNN) + 3 (RNN) = 7 cases.
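A minimal sketch of two of the feature constructions used above, assuming wv is a loaded Gensim KeyedVectors model and tokens is one tokenized review:

import numpy as np

def average_features(tokens, wv, dim=300):
    # Sections 3 and 4(a): average the vectors of in-vocabulary words.
    vecs = [wv[t] for t in tokens if t in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def sequence_features(tokens, wv, max_len=20, dim=300):
    # Section 5: truncate to max_len words and zero-pad shorter reviews.
    vecs = [wv[t] if t in wv else np.zeros(dim) for t in tokens[:max_len]]
    vecs += [np.zeros(dim)] * (max_len - len(vecs))
    return np.stack(vecs)              # shape: (max_len, dim)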

$25.00

[SOLVED] CSCI544: Homework Assignment №1

This assignment gives you hands-on experience with text representations and the use of text classification for sentiment analysis. Sentiment analysis is extensively used to study customer behaviors using reviews and survey responses, online and social media, and healthcare materials for marketing and customer service applications. The assignment is accompanied by a Jupyter Notebook to structure your code.

Please submit:

1. A PDF report which contains answers to the questions in the assignment along with brief explanations of your solution. Please also print the completed Jupyter Notebook in PDF format and merge it with your report (submit one PDF file combining your written answers and the completed Jupyter notebook). In your completed Jupyter notebook, please print the requested values, too.

The libraries that you will need are included in the HW1.ipynb file. You can use other libraries as long as they are reasonably similar to the ones included in the HW1.ipynb file. At the beginning of the .py file, add a read command to read the data.tsv file from the current directory as the input to your .py file.

1. Dataset Preparation (10 points)

We will use the Amazon reviews dataset, which contains real reviews for jewelry products sold on Amazon. The dataset is downloadable at:
https://s3.amazonaws.com/amazon-reviews-pds/tsv/amazon_reviews_us_Beauty_v1_00.tsv.gz

(a) Read the data as a Pandas data frame using the Pandas package, and keep only the Reviews and Ratings fields in the input data frame. Our goal is to train sentiment analysis classifiers that can predict the rating value for a given review. We create a three-class classification problem according to the ratings: let ratings with the values of 1 and 2 form class 1, ratings with the value of 3 form class 2, and ratings with the values of 4 and 5 form class 3. The original dataset is large; to avoid the computational burden, select 20,000 random reviews from each rating class to create a balanced dataset, and perform the required tasks on this downsized dataset. Split your dataset into an 80% training dataset and a 20% testing dataset. Note that you can split your dataset after step 4, when the TF-IDF features are extracted. Follow the given order of data processing, but you may change the order if it improves your final results.

2. Data Cleaning (20 points)

Use some data cleaning steps to preprocess the dataset you created. For example, you can:
– convert all reviews to lowercase
– remove HTML and URLs from the reviews
– remove non-alphabetical characters
– remove extra spaces
– expand contractions in the reviews, e.g., won't → will not; include as many English contractions as you can think of

You can use other cleaning procedures that help to improve performance. You can use Pandas functions or any other built-in functions; do not try to implement the above processes manually. In your report, print the average length of the reviews in terms of character length in your dataset before and after cleaning (to be printed by the .py file).

3. Preprocessing (20 points)

Use the NLTK package to process your dataset:
– remove the stop words
– perform lemmatization

In your report and the .py file, print the average length of the reviews in terms of character length before and after preprocessing.

4. Feature Extraction (10 points)

Use sklearn to extract TF-IDF features. At this point, you should have created a dataset that consists of features and labels for the reviews you selected.
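For step 4, a minimal sklearn sketch; the review and label column names are hypothetical stand-ins for your cleaned data frame:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.tsv", sep="\t")   # assumed already cleaned/preprocessed
X = TfidfVectorizer().fit_transform(df["review"])   # sparse TF-IDF matrix
X_train, X_test, y_train, y_test = train_test_split(
    X, df["label"], test_size=0.2, stratify=df["label"], random_state=0)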
5. Perceptron (10 points)

Train a Perceptron model on your training dataset using the sklearn built-in implementation. Study the generalizations of Precision, Recall, and f1-score to multiclass settings. Report Precision, Recall, and f1-score per class and their averages on the testing split of your dataset. These 12 values should be printed on separate lines by the .py file.

6. SVM (10 points)

Train an SVM model on your training dataset using the sklearn built-in implementation. Report Precision, Recall, and f1-score per class and their averages on the testing split of your dataset. These 12 values should be printed on separate lines by the .py file.

7. Logistic Regression (10 points)

Train a Logistic Regression model on your training dataset using the sklearn built-in implementation. Report Precision, Recall, and f1-score per class and their averages on the testing split of your dataset. These 12 values should be printed on separate lines by the .py file.

8. Multinomial Naive Bayes (10 points)

Train a Multinomial Naive Bayes model on your training dataset using the sklearn built-in implementation. Report Precision, Recall, and f1-score per class and their averages on the testing split of your dataset. These 12 values should be printed on separate lines by the .py file.

Note 1: For questions 5-8, part of the grading is based on being competitive. For each question, we will sort the computed average precision values across the class. The top 40% will receive full credit, the next 30% will lose 1 point, and the bottom 30% will lose 2 points. We use this grading scheme to motivate you to explore ideas for increasing your performance values.

Note 2: To be consistent, when the .py file is run, the following should be printed, each on its own line:
– Average length of reviews before and after data cleaning (with a comma between them)
– Average length of reviews before and after preprocessing (with a comma between them)
– Precision, Recall, and f1-score for the testing split in 4 lines (in the order of rating classes and then the average) for Perceptron (with commas between the three values)
– Precision, Recall, and f1-score for the testing split in 4 lines (in the order of rating classes and then the average) for SVM (with commas between the three values)
– Precision, Recall, and f1-score for the testing split in 4 lines (in the order of rating classes and then the average) for Logistic Regression (with commas between the three values)
– Precision, Recall, and f1-score for the testing split in 4 lines (in the order of rating classes and then the average) for Naive Bayes (with commas between the three values)

In your Jupyter notebook, likewise print the Precision, Recall, and f1-score for the above models on separate lines.
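A minimal sketch of the per-class reporting format for one of the models, reusing the split from the previous snippet (Perceptron shown; the other three models follow the same pattern):

from sklearn.linear_model import Perceptron
from sklearn.metrics import precision_recall_fscore_support

y_pred = Perceptron().fit(X_train, y_train).predict(X_test)

# One "precision, recall, f1" line per rating class ...
p, r, f, _ = precision_recall_fscore_support(y_test, y_pred)
for i in range(len(p)):
    print(f"{p[i]}, {r[i]}, {f[i]}")

# ... then one line with the (macro) averages.
pa, ra, fa, _ = precision_recall_fscore_support(y_test, y_pred, average="macro")
print(f"{pa}, {ra}, {fa}")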

$25.00

[SOLVED] COMP2710: Project 3

Points Possible: 100

Goals:
• To learn streams and file I/O
• To learn how to use tools for stream I/O
• To use arrays to group data elements
• To design and implement functions (Note: this topic was covered in Project 2)
• To perform unit testing
• To design a simple algorithm

Write a program that merges the numbers in two files and writes all the numbers into a third file. Your program takes input from two different files and writes its output to a third file. Each input file contains a list of numbers of type int in sorted order from the smallest to the largest. After the program is run, the output file will contain all the numbers in the two input files in one longer list in sorted order from smallest to largest.

You must provide the following user interface. The user input is depicted in bold, but you do not need to display user input in bold. Please replace "Li" with your name. Your program loads two input files and merges the numbers in the two files.

*** Welcome to Li's sorting program ***
Enter the first input file name: input1.txt
The list of 6 numbers in file input1.txt is:
4 13 14 17 23 89
Enter the second input file name: input2.txt
The list of 5 numbers in file input2.txt is:
3 7 9 14 15
The sorted list of 11 numbers is:
3 4 7 9 13 14 14 15 17 23 89
Enter the output file name: output.txt
*** Please check the new file – output.txt ***
*** Goodbye. ***

Your program must follow the above user interface.

Design Issues: Please do NOT implement this project in a single main() function. You need at least three functions to implement this project. The suggested functions are: 1. array_size filename; cout
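The core of the project is the standard two-pointer merge of two sorted lists. A minimal language-agnostic sketch of the logic, in Python (the assignment itself must be written in C++):

def merge_sorted(a, b):
    # Walk both ascending lists, always taking the smaller head element.
    merged, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i]); i += 1
        else:
            merged.append(b[j]); j += 1
    return merged + a[i:] + b[j:]     # append whichever list has leftovers

# The sample run from the interface above:
assert merge_sorted([4, 13, 14, 17, 23, 89], [3, 7, 9, 14, 15]) == \
       [3, 4, 7, 9, 13, 14, 14, 15, 17, 23, 89]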

$25.00

[SOLVED] COMP2710: Project 2

Points Possible: 100

Goals:
• To learn "while" and "do-while" statements
• To learn how to define functions
• To write a test driver
• To learn how to use assert()
• To use random numbers

Description: In the land of Puzzlevania, Aaron, Bob, and Charlie had an argument over which one of them was the greatest puzzle-solver of all time. To end the argument once and for all, they agreed on a duel to the death (this makes sense?). Aaron was a poor shooter and only hit his target with a probability of 1/3. Bob was a bit better and hit his target with a probability of 1/2. Charlie was an expert marksman and never missed. A hit means a kill, and the person hit drops out of the duel.

To compensate for the inequities in their marksmanship skills, the three decided that they would fire in turns, starting with Aaron, followed by Bob, and then by Charlie. The cycle would repeat until there was one man standing. That man would be remembered for all time as the Greatest Puzzle-Solver of All Time.

Strategy 1: An obvious and reasonable strategy is for each man to shoot at the most accurate shooter still alive, on the grounds that this shooter is the deadliest and has the best chance of hitting back.

Write a program to simulate the duel using this strategy. Your program should use random numbers and the probabilities given in the problem to determine whether a shooter hits his target. You will likely want to create multiple functions to complete the problem. My solution had only one function to simulate the duels, and it passed in the odds and the three guys as pass-by-reference parameters. Once you can simulate a duel, add a loop to your program that simulates 10,000 duels. Count the number of times that each contestant wins and print the probability of winning for each contestant (e.g., for Aaron you might output "Aaron won 3612/10000 duels or 36.12%").

Strategy 2: An alternative strategy for Aaron is to intentionally miss on his first shot, while the rest of the duel is the same as in Strategy 1. Write a function to simulate Strategy 2.
Your program will determine which strategy is better for Aaron.

*** Welcome to Li's Duel Simulator ***

Unit Testing 1: Function – at_least_two_alive()
Case 1: Aaron alive, Bob alive, Charlie alive
Case passed …
Case 2: Aaron dead, Bob alive, Charlie alive
Case passed …
Case 3: Aaron alive, Bob dead, Charlie alive
Case passed …
Case 4: Aaron alive, Bob alive, Charlie dead
Case passed …
Case 5: Aaron dead, Bob dead, Charlie alive
Case passed …
Case 6: Aaron dead, Bob alive, Charlie dead
Case passed …
Case 7: Aaron alive, Bob dead, Charlie dead
Case passed …
Case 8: Aaron dead, Bob dead, Charlie dead
Case passed …
Press any key to continue…

Unit Testing 2: Function Aaron_shoots1(Bob_alive, Charlie_alive)
Case 1: Bob alive, Charlie alive
Aaron is shooting at Charlie
Case 2: Bob dead, Charlie alive
Aaron is shooting at Charlie
Case 3: Bob alive, Charlie dead
Aaron is shooting at Bob
Press any key to continue…

Unit Testing 3: Function Bob_shoots(Aaron_alive, Charlie_alive)
Case 1: Aaron alive, Charlie alive
Bob is shooting at Charlie
Case 2: Aaron dead, Charlie alive
Bob is shooting at Charlie
Case 3: Aaron alive, Charlie dead
Bob is shooting at Aaron
Press any key to continue…

Unit Testing 4: Function Charlie_shoots(Aaron_alive, Bob_alive)
Case 1: Aaron alive, Bob alive
Charlie is shooting at Bob
Case 2: Aaron dead, Bob alive
Charlie is shooting at Bob
Case 3: Aaron alive, Bob dead
Charlie is shooting at Aaron
Press any key to continue…

Unit Testing 5: Function Aaron_shoots2(Bob_alive, Charlie_alive)
Case 1: Bob alive, Charlie alive
Aaron intentionally misses his first shot
Both Bob and Charlie are alive.
Case 2: Bob dead, Charlie alive
Aaron is shooting at Charlie
Case 3: Bob alive, Charlie dead
Aaron is shooting at Bob
Press any key to continue…

Ready to test strategy 1 (run 10000 times):
Press any key to continue…
Aaron won 3593/10000 duels or 35.93%
Bob won 4169/10000 duels or 41.69%
Charlie won 2238/10000 duels or 22.38%

Ready to test strategy 2 (run 10000 times):
Press any key to continue…
Aaron won 4131/10000 duels or 41.31%
Bob won 2594/10000 duels or 25.94%
Charlie won 3275/10000 duels or 32.75%

Strategy 2 is better than strategy 1.

Requirements:

1. You must follow the above user interface to implement your program.

2. You must implement the following functions:

1) bool at_least_two_alive(bool A_alive, bool B_alive, bool C_alive)
/* Input:  A_alive indicates whether Aaron is alive */
/*         B_alive indicates whether Bob is alive */
/*         C_alive indicates whether Charlie is alive */
/* Return: true if at least two are alive, */
/*         otherwise return false */

2) void Aaron_shoots1(bool& B_alive, bool& C_alive)
/* Strategy 1: Use call by reference
 * Input:  B_alive indicates whether Bob is alive or dead
 *         C_alive indicates whether Charlie is alive or dead
 * Return: Change B_alive to false if Bob is killed.
 *         Change C_alive to false if Charlie is killed. */

3) void Bob_shoots(bool& A_alive, bool& C_alive)
/* Call by reference
 * Input:  A_alive indicates whether Aaron is alive or dead
 *         C_alive indicates whether Charlie is alive or dead
 * Return: Change A_alive to false if Aaron is killed.
 *         Change C_alive to false if Charlie is killed. */

4) void Charlie_shoots(bool& A_alive, bool& B_alive)
/* Call by reference
 * Input:  A_alive indicates whether Aaron is alive or dead
 *         B_alive indicates whether Bob is alive or dead
 * Return: Change A_alive to false if Aaron is killed.
 *         Change B_alive to false if Bob is killed.
 */

5) void Aaron_shoots2(bool& B_alive, bool& C_alive)
/* Strategy 2: Use call by reference
 * Input:  B_alive indicates whether Bob is alive or dead
 *         C_alive indicates whether Charlie is alive or dead
 * Return: Change B_alive to false if Bob is killed.
 *         Change C_alive to false if Charlie is killed. */

3. You must implement five unit-test drivers (five functions) to test the above five functions (see the example output above).

4. You must use assert in your test drivers.

5. You must define at least three constants in your implementation. For example, the total number of runs (i.e., 10,000) can be defined as a constant.

Hints:
1. How to implement "Press any key to continue…": cout
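A minimal Python sketch of the simulation logic (the assignment itself must be written in C++ with the functions specified above; the hit probabilities are those given in the description):

import random

HIT = {"Aaron": 1/3, "Bob": 1/2, "Charlie": 1.0}
N_DUELS = 10_000

def duel(aaron_misses_first):
    alive = {"Aaron": True, "Bob": True, "Charlie": True}
    aarons_first_shot = True
    while sum(alive.values()) >= 2:
        for shooter in ("Aaron", "Bob", "Charlie"):
            if not alive[shooter] or sum(alive.values()) < 2:
                continue
            if shooter == "Aaron" and aaron_misses_first and aarons_first_shot:
                aarons_first_shot = False      # Strategy 2: waste the first shot
                continue
            # Strategy 1: aim at the most accurate shooter still alive.
            target = max((p for p in alive if p != shooter and alive[p]),
                         key=HIT.get)
            if random.random() < HIT[shooter]:
                alive[target] = False
    return next(p for p in alive if alive[p])   # the last man standing

for strategy, miss_first in (("strategy 1", False), ("strategy 2", True)):
    wins = {"Aaron": 0, "Bob": 0, "Charlie": 0}
    for _ in range(N_DUELS):
        wins[duel(miss_first)] += 1
    print(strategy, wins)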

$25.00

[SOLVED] CS-7639-O01 Project #3: Crazyflie Design and Analysis Using AADL

CS-7639-O01 Project #3: Crazyflie design and analysis using AADL

About this project

For this project, we aim at using AADL to analyze an existing design of a small UAV, the Crazyflie, and then to extend it to add new capabilities. This project is organized in multiple parts:
• Part 0 is an introduction to AADL and its toolchain. It is provided as a reference;
• Part 1 is a walkthrough of the provided Crazyflie model, where you'll perform multiple analyses and then expand the model;
• Part 2 revisits the Crazyflie models, with the objective of performing safety analysis.

Deliverables

Part 1
• A PDF report with answers to questions
• A .zip archive with updated models

Part 2
• A PDF report with answers to questions
• A .zip archive with updated models

PART 0 – AADL LANGUAGE AND TOOLCHAIN

In this part, we introduce the basic elements of the AADL language and its toolchain.

Suggested reading: the following provides a comprehensive first reference to the AADL:
• Julien Delange, AADL in Practice, http://aadl-book.com

1. About AADL

The "Architecture Analysis and Design Language" (AADL) is both a textual and graphical language for model-based engineering of embedded real-time systems. AADL is used to design and analyze the software and hardware architectures of embedded real-time systems. The AADL allows for the description of both the software and hardware parts of a system. It focuses on the definition of clear block interfaces, and separates the implementations from these interfaces. From the separate descriptions of these blocks, one can build an assembly of blocks that represents the full system. To take into account the multiple ways to connect components, the AADL defines different connection patterns: subcomponent, connection, and binding. An AADL model can also incorporate non-architectural elements: non-functional properties (execution time, memory footprint, …), behavioral or fault descriptions. Hence it is possible to use AADL as a backbone to describe all the aspects of a system.

Let us review these elements. An AADL description is made of components. Each component category describes well-identified elements of the actual architecture, using the same vocabulary as systems or software engineering. The AADL standard defines software components (data, thread, thread group, subprogram, process), execution platform components (memory, bus, processor, device, virtual processor, virtual bus), hybrid components (system), and imprecise components (abstract). Component declarations have to be instantiated as subcomponents of other components in order to model an architecture. At the top level, a system contains all the component instances. Most components can have subcomponents, so an AADL description is hierarchical. A complete AADL description must provide a top-most system that contains certain kinds of components (processor, process, bus, device, abstract and memory), thus providing the root of the architecture tree. The architecture in itself is the instantiation of this system, which is called the root system.

The interface of a component is called the component type. It provides features (e.g. communication ports). Components communicate with one another by connecting their features. To a given component type correspond zero or more implementations. Each of them describes the internal structure of the component: subcomponents, and connections between those subcomponents.
They can also refine non-functional properties, or add new ones. The AADL defines the notion of properties. They model non-functional properties that can be attached to model elements (components, connections, features, instances, etc.). Properties are typed attributes that specify constraints or characteristics that apply to the elements of the architecture, such as the clock frequency of a processor, the execution time of a thread, or the bandwidth of a bus. Some standard properties are defined, e.g. for timing aspects, but it is possible to define new properties for different analyses (e.g. to define particular security policies). Besides, the language is defined by a companion standard document that defines legality rules for component assemblies and its static and execution semantics. Figure 1 illustrates a complete space system, used as a demonstrator during the ASSERT project. It shows how software and hardware concerns can be separately developed and then combined in a complete model.

Figure 1: AADL ASSERT demonstrator

2. About the AADL toolchain

The correct engineering of Cyber-Physical Systems requires the designer to perform multiple types of analysis on a candidate architecture:

• Semantic analysis: is the model correct w.r.t. basic concerns? e.g. interfaces, containment hierarchy, consistency of configuration parameters, etc.;
• Safety analysis: are the failure modes correctly defined? handled?

To support these analyses, we will use OSATE, from http://osate.org, developed by CMU/SEI. This is the reference implementation of an AADL toolchain. It supports textual and graphical editing, and timing and safety analysis.

3. Using a virtual machine [recommended]

A virtual machine is provided with all the tools (see section 4 below). It is available at the address:
https://drive.google.com/file/d/1oN9hWWVP9ocj3gKWPhRiVrDr0sUcU08L/view?usp=sharing
login: cs7639 / password: pqsszd

To open the ova file:
1. download the ova file to your computer
2. install VMWare Workstation (https://www.vmware.com/products/workstation-player.html)
3. open the ova file with VMWare using the default settings

Make sure you read the readme file on the desktop of the VM.

4. Installation of the toolchain on your own machine

It is expected you can install all tools on a laptop in less than 1.5 hours. Simply refer to each web site for details. We strongly suggest you opt for a Linux-based installation.
• Download OSATE from http://osate.org/osate-releases.html#stable-releases
• Install the Cheddar plug-in: go to Help -> Install Additional OSATE Components, and select Cheddar, from http://beru.univ-brest.fr/cheddar/

5. Project models

For this homework and the future project, we will use the AADL models from the following repository: https://github.com/OpenAADL/Crazyflie
It is advised you clone the repository; this allows for future updates of the models. The project files have already been set up in the virtual machine provided.

6. Guided tour of the tools

a. OSATE

[Note that OSATE is updated regularly and some modifications to the GUI have been made recently. The following text is compatible with OSATE 2.4.0.]
• Start OSATE.

The workspace is organized using the usual IDE conventions:
• the left panel is the list of files,
• the central panel is the current model/file being edited,
• the right panel is the outline, with the list of model entities (system, thread, process, etc.).

• To update the repository: right-click on the Crazyflie project in the left panel, then select Team -> Pull.

Note: some graphical models have been built and organized, some others not. Expect some obfuscated diagrams.
Note #2: a right-click on the diagram allows you to select the elements that will be visible. Some properties and connections might not be visible immediately.

• To analyze a model, one has to identify the "root" of the system. The root is defined as the top-most component of a model in the outline (right panel). We have two root systems:
  o Crazyflie_Functional_Chain.Impl, in crazyflie_functional.aadl
  o Crazyflie_System.Impl, in crazyflie_system.aadl

Analyzing an AADL model is a two-step process:
1. Instantiate the root system: in the right panel that lists all components, right-click on the top-most system implementation (i.e. System Crazyflie_System.impl) and select Instantiate. This will generate a .aaxl2 file in the models/instances directory.
2. Perform the analysis: right-click on the generated file, and select the analysis from the AADL Analyses menu.
3. For Cheddar, select the instance file, then the "cheese" icon in the main menu bar to generate the Cheddar XML file. Then run Cheddar.

Supported analyses are:
  o Safety –> Run Fault Tree Analysis / Fault Impact Analyses
  o Timing –> Check Flow Latency
  o Scheduling Analysis (using Cheddar)

PART 1 – Unboxing the Crazyflie

About the Crazyflie UAV

The Crazyflie 2.0 is a versatile flying development platform that weighs only 27 g and fits in the palm of your hand. The Crazyflie 2.0 is an open-source project, with source code and hardware design both documented and available. All information is available from the BitCraze.io website: https://www.bitcraze.io/crazyflie-2/.

Crazyflie 2.0 System Architecture

Crazyflie 2.0 is architected around two microcontrollers:
§ An NRF51, Cortex-M0, that handles radio communication and power management:
  § ON/OFF logic
  § Enabling power to the rest of the system (STM32, sensors and expansion board)
  § Battery charging management and voltage measurement
  § Master radio bootloader
  § Radio and BLE communication
  § Detecting and checking installed expansion boards
§ An STM32F405, Cortex-M4 @ 160 MHz, that handles the heavy work of flight control and everything else:
  § Sensor reading and motor control
  § Flight control
  § Telemetry (including the battery voltage)
  § Additional user development

See Figure 2 for more details.

The nRF51822

The two main tasks of the nRF51 are to handle the radio communication and the power management. It acts as a radio bridge (it communicates raw data packets to the STM). Crazyflie 2.0 uses the radio for both CRTP and BLE, but the hardware also supports other protocols like ANT. The CRTP mode is compatible with the Crazyradio USB dongle, and it provides a 2 Mbit/second data link with low latency. Tests show that the latency of the radio link is between 360 µs and 1.26 ms, at 2 Mbps without retry and with packet sizes of 1 and 32 bytes respectively. The minimum achievable latency with Bluetooth is 7.5 ms, but the current implementation is more around 20 ms.
The main benefit of the CRTP link with the Crazyradio is that it is easily implemented on any system that supports USB host, which makes it the first choice to hack and experiment with the Crazyflie. BLE is implemented mostly for the use case of controlling the Crazyflie 2.0 from a mobile device.

One of the other particularities of the nRF51 chip is that it was designed to run from a coin battery, which means that it is well suited for low-energy operation. So the nRF51 is also responsible for power management. It handles the ON/OFF logic, which means that the nRF51 is always powered and that different actions are possible when pressing the ON/OFF button for a long time (e.g. this is used to start the bootloader). It is also possible to wake the Crazyflie 2 from one pin of the expansion port, which allows wake-up by an external source.

Figure 2 Crazyflie 2.0 system architecture

The STM32F405

The STM32 runs the main firmware. Even though it is started by the nRF51, it acts as a master towards the nRF51. It implements flight control and all the communication algorithms. The expansion port is mainly connected to the STM32, so the drivers for expansion boards sit in the STM as well. The STM32F405 has 196 kB of RAM, which should be enough for anyone (famous last words...). This is overkill for just the flight controller, but it allows for more computationally intensive algorithms, for example sensor fusion between the inertial sensors and GPS data.

Inter-MCU communication

The communication between the two CPUs is handled by the syslink protocol. It is a simple packet-based protocol made to provide an extensible communication scheme. Syslink provides messages for carrying all required communication between the CPUs. The STM32 is the master and the nRF51 the slave. As much as possible, the nRF51 is kept simple and stupid, offloading complex algorithms to the STM32. Examples of syslink messages are:
• Raw radio packets, to be sent and received
• Power management measurements

Crazyflie 2.0 Controller architecture

The following images illustrate the architecture of the controller at system and implementation levels. The controller is split into two sub-controllers: one Attitude PI-Controller running at 250 Hz, and one Rate P-Controller running at 500 Hz.

Note: the detail of the controllers is outside the scope of the assignments.

AADL modeling of the Crazyflie

We built a set of AADL models to serve as an entry point for our project. The models are both graphical and textual; they are organized as follows:

• crazyflie_functional.aadl: abstract functional chain of the Crazyflie, adapted from the functional architecture;
• crazyflie_hardware.aadl: hardware part of the UAV, capturing the various hardware elements;
• crazyflie_software.aadl: software part of the UAV; it is a candidate implementation only;
• crazyflie_types.aadl: lists the AADL data types used for the component interfaces;
• crazyflie_system.aadl: one candidate full system, combining the hardware and software elements;
• crazyflie_final.aadl: mapping of the abstract functional chain to the candidate implementation.

These models capture the outcome of a typical design flow for CPS.
• First, we built the high-level functional chain (crazyflie_functional.aadl),
• Then, we built the hardware (crazyflie_hardware.aadl) and software (crazyflie_software.aadl) candidates and one system combining them (crazyflie_system.aadl). By combining, we mean that the software elements are mapped to hardware ones.
• Later, we built the crazyflie_final model to ensure all functions are bound to implementation elements.

Part 1.1: Flow latency analysis

Note: in this part, we will use OSATE to analyze and extend the model.

Suggested reading: "Impact of Runtime Architectures on Control System Stability" by Peter H. Feiler and Jürgen Hansson, ERTS 2008, https://hal-insu.archives-ouvertes.fr/insu-02270102

The proposed model has a few flows modeled through the system. In AADL, a flow captures the propagation of data and its processing by various model elements such as devices and threads. Flows are an interesting capability for modeling the expected end-to-end latency of a processing chain. Our Crazyflie UAV has multiple paths from sensors to actuators, and it is beneficial to review them all.

In AADL, a flow is a piecewise definition of data "flowing" through component interfaces. The root of a flow is an "end to end flow" that lists its constituent, or atomic, flows. Here is the definition of one end-to-end flow from the Crazyflie_System package:

flows
  -- etef1 represents the functional chain from MPU9250 IMU to STM32
  -- firmware to one of the propellers.
  etef1 : end to end flow MPU9250.f1 -> C11 -> STM32F405_Firmware.f2 -> C12 -> M1.f1
    { latency => 0 ms .. 2 ms; };

Each subcomponent (MPU9250, STM32F405_Firmware, etc.) lists additional flows, here f1 and f2 respectively. An end-to-end flow starts with a flow source (e.g. a sensor) and ends with a flow sink (e.g. an actuator).

Q2: add the additional end-to-end flows from Q1 in the AADL models, as well as the necessary flow source/path/sink declarations. For each of them, you'll use a specific notation to indicate the modified elements. The suggested annotation is to use AADL comments, like

-- John Doe begin of addition for question Q2
-- John Doe end of addition for question Q2

where John Doe is changed to your actual name, and the question number adjusted accordingly.

Q3: Execute the latency analysis on the instantiated model with the default settings. What do you observe on the obtained values? Considering that the Crazyflie has a symmetrical mechanical design, can you consider that some of the flows are redundant, i.e. capture the same behavior? Remember that flows measure the latency, not the value, of a signal.

Q4: In the description of the Crazyflie architecture, the latency requirements for various aspects of the system are detailed (see figures page 9). You should convert the frequencies to periods. List, if any, the end-to-end flows that these requirements affect. How will the end-to-end latency for these flows be affected by the requirements? Does the current architecture meet these requirements? If not, which end-to-end flows need additional timing allocation?

Part 1.2 Simulation of the model

Note: in this part, we will use Cheddar to simulate the Crazyflie_System model.

A typical error when building models is lacking ways to "debug" your system. Simulation is one possible option to address this issue. For example, the MPU9250 device is declared as a periodic component:

device MPU9250
features
  DOF6 : out data port Crazyflie_Types::Six_Axis.impl;
  i2c_bus : requires bus access Buses::I2C::I2C.impl;
properties
  Dispatch_Protocol => Periodic;
  Period => 10 ms;
end MPU9250;

Another example is the definition of this thread:

thread Power_Management
properties
  Priority => 2;
  Dispatch_Protocol => Periodic;
  Period => 500 us;
  Compute_Execution_Time => 10 us .. 20 us;
end Power_Management;
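For contrast, a component that reacts to events rather than a clock is declared with the Sporadic dispatch protocol. The following is a minimal sketch with hypothetical names (it is not part of the provided models); the meaning of the two protocols is recalled just after:

thread Radio_Handler
features
  packet_in : in event data port;
properties
  Dispatch_Protocol => Sporadic;
  Period => 1 ms;  -- minimum delay between two successive dispatches
  Compute_Execution_Time => 50 us .. 100 us;
end Radio_Handler;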
In this model, we use two dispatch protocols:
• Periodic: the component will be executed periodically;
• Sporadic: the component will be executed only if an event is received on one of its event (data) ports, and after some elapsed time represented by the Period property.

Part 1.3 Scheduling analysis of the model

In this part, we continue exploring the analysis of the model with Cheddar. Scheduling analysis provides a mechanism to assess the schedulability of a system. Cheddar has multiple plug-ins to perform scheduling analysis.

Q6: run the scheduling analysis on the system by clicking the gear button. What can you conclude?

Q7: how would you relate flow analysis, simulation and scheduling analysis? What is their benefit in a complete Systems Engineering process?

Part 1.4 Adding a new component: Flow Deck

This page: https://wiki.bitcraze.io/projects:crazyflie2:expansionboards:flow lists an expansion board for the Crazyflie. In order to improve the maneuverability of the UAV, one needs to add an additional sensor to evaluate the lateral movement of the UAV. This solves the "sliding" behavior of the drone, at the expense of extra computation. In this part, we want to add the Flow Deck expansion board to the initial model.

One strategy is to define the corresponding devices in the hardware package, add them to the hardware system, and add the related model blocks to the software and functional models. To avoid losing the previous model, we want to use the refinement/extension capabilities of AADL to add the elements related to the Flow Deck expansion board:

system implementation Crazyflie_System.impl extends Crazyflie_Hardware::Crazyflie.impl
subcomponents
  nRF51822_Firmware : process Crazyflie_Software::nRF51822_Firmware;
  STM32F405_Firmware : process Crazyflie_Software::STM32F405_Firmware.impl;

Q8: provide an expanded model with the Flow Deck integrated. Ensure all previous analyses are still feasible. You'll provide the rationale for all the updates you performed in your report.

PART 2 – Safety Analysis

In this part, we will evaluate the safety architecture of the UAV through multiple analyses mandated by the ARP4761 standard.

Additional reading: [1] Julien Delange, Peter H. Feiler, David P. Gluch, John J. Hudak, "AADL Fault Modeling and Analysis Within an ARP4761 Safety Assessment". This report explains how to leverage the Error Model version 2 (EMV2) to follow the ARP4761 safety process. The full report is available as an SEI Technical Report from http://resources.sei.cmu.edu/library/asset-view.cfm?assetID=311884

This document lists in sections 1 and 2 how to map AADL to the analyses that are mandated by the ARP4761 safety process. This process is one of the many steps to create a certified system. In the following, we will apply some safety analyses to the Crazyflie model.

Important Notes:

Functional Hazard Analysis of the functional chain

An FHA is a systematic examination of functions to identify and classify failure conditions of those functions according to their severity. The AADL Error Model Annex supports the FHA through property assignments within an AADL architecture model. Users can generate an FHA report using the Error Model Annex and OSATE, by assigning Hazard, Likelihood, and Severity property values to points of failure. Then, OSATE can generate the FHA report.

The EMV2 is an annex language; it adds a set of new concepts and associated language elements to the model. The Crazyflie_Errors package defines a set of error types that can be propagated: ValueError, ValueErroneous and Lost.
• ValueError indicates the value cannot be produced;
• ValueErroneous indicates an incorrect value has been produced;
• Lost means the component is now defunct, and the associated function is lost.

The Crazyflie_Errors package also defines a simple state machine, made of two states, Operational and Failed. This state machine will serve as a skeleton to build the failure modes of our UAV components.

In the following example, we attach the probability of occurrence of entering the Failed state, and some documentation attached to this faulty situation, in addition to its severity and likelihood. Section 2.10.1 of [1] provides a complete definition of the Hazards property.

abstract Accelero
features
  Accelero_Out : out data port;
  -- [..]
annex emv2 {**
  use types Crazyflie_Errors;             -- definition of error types
  use behavior Crazyflie_Errors::simple;  -- definition of error modes
  properties
    -- Useful for FHA reports
    EMV2::OccurrenceDistribution =>
      [ ProbabilityValue => 1.0e-9; Distribution => Poisson; ] applies to Failed;
    EMV2::severity => 1 applies to Failed;
    EMV2::likelihood => C applies to Failed;
    EMV2::hazards => ([
      crossreference => "";
      failure => "Loss of sensor readings";
      phases => ("all");
      description => "Sensor failure";
    ]) applies to Failed;
**};
end Accelero;

A typical issue when building this analysis is getting meaningful values for the probability of occurrence of errors for a component. OEMs usually do not publish them on a regular basis, except for military-grade components. Here, we assumed an unrealistic value of 10^-9.

Q10: Generate the corresponding FHA report using OSATE.

Note: you'll observe OSATE simply aggregates the elements from the model. The additional benefit is that the modeling language performs cross-checks on the names of the failure modes, the coverage of modes, etc. These ensure the report is consistent and complete.

Reliability Analysis of the functional chain

To achieve this goal, one needs to capture the failure mode of the system as a composition of the failure modes of its subcomponents. This can be captured in Crazyflie_Functional_Chain.impl using an EMV2 subclause and a composite error behavior. This behavior refines the error automaton of the system with a Boolean equation stating when the system is either in the Operational or the Failed mode. Obviously, the system is in Operational mode when all sensors are Operational, but this condition alone is incomplete.

system implementation Crazyflie_Functional_Chain.impl
  -- [..]
annex EMV2 {**
  use types Crazyflie_Errors;
  use behavior Crazyflie_Errors::simple;
  composite error behavior
  states
    [ Acc.Operational and Gyro.Operational and Magneto.Operational ]-> Operational;
    [ TBD ]-> Failed;
  end composite;
**};
end Crazyflie_Functional_Chain.impl;

Q11: What are the conditions for all elements to be either in the Operational or Failed modes? Extend the model accordingly. What is the failure probability you get, using the "Fault Tree Analysis" plug-in?

Taking into account software "failure"

abstract Accelero
  -- [..]
annex emv2 {**
  use types Crazyflie_Errors;
  use behavior Crazyflie_Errors::simple;
  error propagations
    -- outgoing error propagation
    Accelero_out : out propagation {ValueError};
  flows
    -- When the sensor fails, its error is propagated through port Accelero_Out
    ErrorSource : error source Accelero_out {ValueError} when {ValueError};
  end propagations;
  -- [..]
**};
end Accelero;
Q12: Update the model to capture all error sources (components) in the functional chain, using the Accelero abstract component as a template.

The Sensor_Fusion abstract component is in charge of combining the inputs from the three sensors. It outputs one state vector that corresponds to the attitude of the UAV.

• The component error behavior captures the error automaton of the component. The TBD part is a Boolean equation that captures how the system moves from the Operational to the Failed mode. The propagations part indicates the error being reported in case of an error.

abstract Sensor_Fusion
  -- [..]
annex EMV2 {**
  use types Crazyflie_Errors;
  use behavior Crazyflie_Errors::simple;
  error propagations
    Accelero_In : in propagation {ValueError};
    Data_F_Out : out propagation {ValueError};
  flows
    f1 : error path Accelero_In -> Data_F_Out;
  end propagations;
  component error behavior
  transitions
    t1 : Operational -[ TBD ]-> Failed;
  propagations
    Failed -[]-> Data_F_Out{ValueError};
  end component;
**};
end Sensor_Fusion;

Q13: propose and implement an update to the model that captures the following hypothesis about the fusion algorithm used: any error received as input will translate into an error on the output.

Q14: run the Fault Tree Analysis again. How does the value compare with the previous one? Is it expected?

Q15: the Fault Impact Analysis plug-in allows one to see how an error propagates through the functional chain. Execute the plug-in and compare the output to your model. How can you link each element of the fault impact analysis to model elements?

Q16: so far, we have mostly performed basic updates on the system. Complete the error model by adding failures on the motors, and the propagation of value errors through the controller.


[SOLVED] Cpe2600 lab 12- multithreading

Acknowledgement: Many aspects of this assignment are modeled after this: Operating Systems Principles (fiu.edu). See additional acknowledgements in the template code.

Topical Concepts

Multithreading

Like multiprocessing, multithreading can deliver a number of benefits, the most obvious again being some speedup by carrying out processing in parallel. In the previous lab, we employed multiprocessing by running separate, independent processes. This week we will "parallelize" within each process by using separate threads, each calculating a different portion of the Mandelbrot image. Threads are ideal for this addition as opposed to processes, since threads share heap memory without additional overhead.

A summary of the activity: add an option to the mandel program which tells it how many threads to split into to calculate a single image. You will then benchmark the program with various combinations of multithreading and multiprocessing. No Lab 11 features should be removed. The bulk of your changes will relate to how the loops in compute_image() operate. Note that the loop counters i and j simply iterate over each pixel of the image and call iterations_at_point() to get the "color" at that pixel. We can easily parallelize this by splitting the image up into 2 or more distinct regions and having a separate thread iterate over its own region.

The Exercise

Specific Instructions

1. You will work in the same repository as the last assignment. You shall create a new branch "lab12dev" and do all new work on that branch. In the meantime, the main branch of your repository should remain at the completed, deployable Lab 11 project.

2. Add a command line argument to set the number of threads to be used to calculate an image. The exact place that you make your changes will depend a bit on how you completed Lab 11. If you left mandel.c mostly unchanged, you can simply add a new command line argument to the mandel program to give it the number of threads it should split into. If you modified the mandel program to become mandelmovie, you again would simply add a new command line argument. Do not remove any features from Lab 11 – we will ultimately be doing both multiprocessing and multithreading.

3. Using "pthreads", spin off the requested number of threads to calculate a single image. A couple of notes (see the sketch after this list):
b. You will need a way to pass each thread its region. You can pass a void* to the thread function. Keep in mind that that pointer could point to memory that contains that thread's assignment or, since a pointer is 8 bytes on our system, it does not have to be interpreted as a memory address.
c. You should support a minimum of 1 thread up to a maximum of 20.

4. Finally, re-run your runtime data collection from last week by also varying the number of threads used in addition to the number of processes. Create a table with # threads on one axis and # processes on the other. In each cell, place the runtime to generate 50 images with that combination. Add to your existing report in README.md:
a. A brief overview of your implementation of multiple threads.
b. The table described above.
c. A brief discussion of your results. Answer the following questions:
i. Which technique seemed to impact runtime more – multithreading or multiprocessing? Why do you think that is?
ii. Was there a "sweet spot" where optimal (minimal) runtime was achieved?
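To make the region-splitting idea concrete, here is a minimal sketch of dividing the rows of an image among N pthreads. It is illustrative only, not the required implementation: compute_region, struct region, and the stand-in iterations_at_point() are hypothetical names, and the real program would reuse the existing mandel data structures.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define WIDTH  1000
#define HEIGHT 1000

static int image[HEIGHT][WIDTH];

/* Hypothetical stand-in for the real per-pixel computation. */
static int iterations_at_point(int i, int j) { return (i + j) % 256; }

/* Each thread owns a half-open range of rows [row_start, row_end). */
struct region {
    int row_start;
    int row_end;
};

static void *compute_region(void *arg)
{
    struct region *r = arg;
    for (int j = r->row_start; j < r->row_end; j++)
        for (int i = 0; i < WIDTH; i++)
            image[j][i] = iterations_at_point(i, j);
    return NULL;
}

int main(int argc, char *argv[])
{
    int nthreads = (argc > 1) ? atoi(argv[1]) : 1;
    if (nthreads < 1 || nthreads > 20) {
        fprintf(stderr, "thread count must be between 1 and 20\n");
        return 1;
    }

    pthread_t tid[20];
    struct region regions[20];
    int rows_per_thread = HEIGHT / nthreads;

    for (int t = 0; t < nthreads; t++) {
        regions[t].row_start = t * rows_per_thread;
        /* The last thread absorbs the leftover rows when HEIGHT % nthreads != 0. */
        regions[t].row_end = (t == nthreads - 1) ? HEIGHT : (t + 1) * rows_per_thread;
        pthread_create(&tid[t], NULL, compute_region, &regions[t]);
    }
    for (int t = 0; t < nthreads; t++)
        pthread_join(tid[t], NULL);

    printf("computed %d rows with %d thread(s)\n", HEIGHT, nthreads);
    return 0;
}

Passing &regions[t] is the "pointer to memory containing the assignment" option from note b; the alternative is to smuggle a small integer through the void* value itself.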
Deliverable

• When you are ready to submit your assignment, prepare your repository:
• Make sure your name, assignment name, and section number are in all files in your submission – in the comment block of source file(s) and/or at the top of your report file(s).
• Make sure you have completed all activities and added all new program source files to the repository.
• Make sure your assignment code is commented thoroughly.
• Make sure all files are committed and pushed to the main branch of your repository.
• Tag your repo with the tag `Lab12vFinal`

***NOTE***: Do not forget to 'add', 'commit', and 'push' all new files, branches, and changes to your repository before submitting. To submit, copy the URL for your repository and submit the link to the associated Canvas assignment.


[SOLVED] Cpe2600 lab 11- multiprocessing

Lab 11 Multiprocessing

Acknowledgement: Many aspects of this assignment are modeled after this: Operating Systems Principles (fiu.edu). See additional acknowledgements in the template code.

Topical Concepts

Multiprocessing

Although there are a number of benefits of multiprogramming, the most obvious is to achieve speedup of lengthy operations. To really explore these benefits, we have to have a lengthy operation. How about generating an image of a fractal? The Mandelbrot set is one such calculation that can be made and easily visualized. Depending on a variety of parameters, generation of a single image of the Mandelbrot set can take up to a few seconds on modern computing hardware. To aid visualization, often many images are calculated by varying parameters such as origin or scale factor and then pieced into a movie. Examples can be seen here: Mandelbrot set – Wikipedia.

The primary goal of this phase of the project is to employ multiprocessing to enjoy a speedup while generating Mandelbrot plots. A program in the starter repository (mandel) has been provided that features a command line interface which allows adjustment of various parameters; it will then generate a Mandelbrot set visualization and save it to a jpeg file. Note that mandel uses 'getopt' to process command line arguments. You will need to use getopt for your new program, or extend it to accommodate the new command line options if you choose to modify the existing program.

Once you have your 50 frames (or more if you wish), you can stitch them together into a movie using ffmpeg (you will need to install it – sudo apt update / sudo apt install ffmpeg). Note, you will also need to install a development library to handle jpeg files to build mandel (sudo apt install libjpeg-dev).

Once everything is working, use the time command to measure the time it takes to generate the 50 images with various allowed numbers of child processes (1, 2, 5, 10, 20, for example). Plot the results. Does using more processes always speed up the operation? See below for the deliverable. In Lab 12 we will investigate a multithreaded version.

Git Source Version Control – Next Steps

Branches

So far we have been using git in its simplest form, really just as a tool to track a "linear" set of changes to a software project. We have an easy way to track changes, to mark milestones, and to undo changes. This is quite useful by itself if you are a solo developer and are not really sharing your code or repository with anyone. In reality, you will rarely be a solo developer, and your repository will likely be accessed by other developers and stakeholders. So, we must introduce additional features to deal with more complicated scenarios. The first new concept is that of a branch. Like many things, this can get very complicated, so we will just be scratching the surface here as well.

So far, we have just a single branch in our repository. By default, it is named "main." As we made changes, we committed those changes to the main branch. However, if we are working with other developers, or even just publishing our repository for public consumption of our project, there is an expectation that the main branch is always deployable. That is, a tested, stable version of the project. In addition, the numerous commits that you make leading up to that next stable version can ultimately create a lot of clutter.
So, you should be doing your actual development on a branch, leaving the main branch unchanged until the new features developed on the branch are fully tested and ready to deploy. At that point the branch can be merged back to the main branch. Some guidelines suggest maintaining a parallel development branch and only merging to main for releases.

A branch is created with git checkout; with the -b option, this will create and switch to the new branch in one step:

unix> git checkout -b labWeek11dev

Any commits made now will be on the branch. You can go back to the main branch with another git checkout command, but be careful: any commits made on the main branch will complicate merging the branches later.

We can commit to this branch now, but technically the branch does not exist in the remote repository, so right now, at least, we cannot push. To do that, there is one additional step we need to take. The following command will push to the remote and specifically create the same branch on the remote repository:

unix> git push -u origin labWeek11dev

Now we can commit changes locally and push to the remote the same way we have always done. Until we check out a different branch or commit in the repo, we will be adding all commits on the branch (which is what we want). Interestingly, as long as no commits are made to main, commits will still appear "linear." If you were to graph the commits, you would still see a single line of history.

Merging

Ultimately, changes made on a branch will need to be merged back onto the main branch. As stated, ideally no commits have been made to main, since you are doing your work on a branch. So, the process of a merge is pretty trivial. Be sure that you have no uncommitted changes when you start a merge, or you might wind up with some unintended issues. If your current location (HEAD) is still on the branch and you attempt to merge with main, nothing really happens because, although it is called a branch, if no commits have been made on main, there is no bifurcation in the history. To successfully merge to main, you will need to change your location to the main branch (checkout) and issue the merge from there.

Note the message – "Fast-forward". This basically acknowledges that no changes were made to main since the branch was created. If changes had been made, it would be a "true merge" or a "three-way merge", which has a potential for conflicts. If there are conflicts, you will have the opportunity to undo the merge or proceed. If you proceed, your files will be marked where there are conflicts and you will need to manually fix them. A pain in a large project, for sure. You can push to remote when done and have your new "release" ready to go.

The Exercise

Specific Instructions

1. Accept the new GitHub Classroom assignment at https://classroom.github.com/a/0nNkoC5K and clone the repository.

2. Before making any changes, create a branch named "labWeek11dev" and switch to that branch. Push the branch to the remote repo as shown above.

5. Collect runtimes for 1, 2, 5, 10, and 20 child processes. Plot into a graph with # processes on the X axis and runtime on the Y axis. Prepare a brief report in README.md (edit the one in your repository). In the report, include:
a. A brief overview of your implementation.
b. The graph of your runtime results. You will need to export the plot (from Excel) into an image, add the image to your repo, and then link it into the README.md.
c. A brief discussion of your results.

6. Create a movie of your 50 images. You can use the ffmpeg tool – this command should work: ffmpeg -i mandel%d.jpg mandel.mpg

*** We will demo a few people's movies in the lab. Include your best movie in your repo for review and grading. ***
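The core of the speedup above is a simple fork/wait fan-out: cap the number of live children, and start a new frame whenever a slot frees up. Here is a minimal sketch under assumed names (generate_frame is a hypothetical stand-in for one mandel run with that frame's parameters); it is illustrative only, not the required implementation.

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define FRAMES 50

/* Hypothetical stand-in: compute one frame's image and save it to mandelN.jpg. */
static void generate_frame(int frame)
{
    char outfile[64];
    snprintf(outfile, sizeof(outfile), "mandel%d.jpg", frame);
    /* ... vary scale/origin for this frame, render, and write outfile ... */
    (void)outfile;
}

int main(int argc, char *argv[])
{
    int max_children = (argc > 1) ? atoi(argv[1]) : 1;
    int running = 0;

    for (int frame = 0; frame < FRAMES; frame++) {
        if (running == max_children) {   /* throttle: wait for a slot to free up */
            wait(NULL);
            running--;
        }
        pid_t pid = fork();
        if (pid == 0) {                  /* child: do exactly one frame, then exit */
            generate_frame(frame);
            exit(0);
        }
        running++;
    }
    while (wait(NULL) > 0)               /* reap any remaining children */
        ;
    return 0;
}

Running it as ./a.out 5 would keep at most 5 children rendering frames at any moment, which is the kind of knob the runtime measurements in step 5 vary.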
Deliverable

• When you are ready to submit your assignment, prepare your repository:
• Make sure your name, assignment name, and section number are in all files in your submission – in the comment block of source file(s) and/or at the top of your report file(s).
• Make sure you have completed all activities and added all new program source files to the repository.
• Make sure your assignment code is commented thoroughly.
• Make sure all files are committed and pushed to the main branch of your repository.
• Tag your repo with the tag `vFinal`
• Include your best movie in the repo (we would not normally include such a file in a repo, but for grading we do)

***NOTE***: Do not forget to 'add', 'commit', and 'push' all new files, branches, and changes to your repository before submitting. To submit, copy the URL for your repository, submit the link to the associated Canvas assignment, and add a comment on Canvas that you have completed Lab 11.


[SOLVED] Cs7280 assignment 1- introduction to networkx 2025

Do not add any additional cells in this assignment. If you write additional functions for testing, please remove them before submitting the assignment. All code and comments should be written in between the lines ####IMPLEMENTATION STARTS HERE#### and ####IMPLEMENTATION ENDS HERE####. Please do not remove cell tags (e.g. 'export' and 'test').

# Setup

Create a conda environment: `conda create --name cs7280_env python=3.8 -y`

Activate the environment: `conda activate cs7280_env`

## Dependencies

Install the necessary libraries for this assignment after activating your conda environment and navigating to the correct directory.

```bash
pip install -r requirements.txt
```

# Imports

```python
import copy
import networkx as nx
import numpy as np
import random
import scipy as sp
```

Please don't add additional import statements or modify this cell.

# Part 1 [20 pts] Getting Started with NetworkX (US cities network)

## 1.1 [6 pts] Creating graphs with certain properties

In the body of the function below, generate the indicated graphs, with each graph satisfying the properties given in the function's docstring.

```python
def graphs_with_certain_props() -> (nx.Graph, nx.Graph, nx.Graph, nx.Graph, nx.DiGraph, nx.MultiGraph):
    """
    Behavior
        Generates graphs, each of a distinct type
    Returns
        simp      networkx Graph       10 node graph, undirected, exactly 21 edges,
                                       no self loops, no multi-edges
        cycle     networkx Graph       10 node graph, undirected, entire graph is a single cycle
        clique    networkx Graph       10 node graph, undirected, in which the entire graph is a clique
        star      networkx Graph       10 node graph, undirected, is a star
        direct    networkx DiGraph     10 node graph, directed, exactly 9 edges and
                                       exactly 1 node of out-degree 0
        non_simp  networkx MultiGraph  10 node graph, undirected, exactly 21 edges with exactly 10
                                       of those edges being self loops and exactly 11 edges between
                                       the same pair of distinct nodes a and b
    """
    ####IMPLEMENTATION STARTS HERE####
    # These lines are placeholders
    simp = nx.Graph()
    cycle = nx.Graph()
    clique = nx.Graph()
    star = nx.Graph()
    direct = nx.DiGraph()
    non_simp = nx.MultiGraph()
    return simp, cycle, clique, star, direct, non_simp
    ####IMPLEMENTATION ENDS HERE####
```

## 1.2 [4 pts] Relationship between algebraic features of a graph

In the body of the function below, compute the maximum degree among the nodes of the parameter graph `G`.

```python
def max_degree(G:nx.Graph) -> int:
    """
    Params
        G        networkx Graph  an arbitrary undirected graph
    Return
        max_deg  int             the maximum degree among all nodes of the graph G
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    max_deg = 7280
    return max_deg
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, compute the average degree among the nodes of the parameter graph `G`.

```python
def avg_degree(G:nx.Graph) -> float:
    """
    Params
        G        networkx Graph  an arbitrary undirected graph
    Return
        avg_deg  float           the average degree among all nodes of the graph G
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    avg_deg = 7280.0
    return avg_deg
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, compute the leading eigenvalue of the adjacency matrix of the parameter graph `G` and then return that value. You may recall from a course in linear algebra that the leading eigenvalue is the eigenvalue of greatest overall magnitude. For example, if a matrix $A$ has eigenvalues $2, -5, 3i,$ and $-4+5i$, these numbers have respective magnitudes of $2, 5, 3$, and $\sqrt{41}\approx 6.4$. Thus, the leading eigenvalue is $-4+5i$.
```python
def leading_eigenvalue(G:nx.Graph) -> float:
    """
    Params
        G    networkx Graph  an arbitrary undirected graph
    Return
        eig  float           the 'leading' eigenvalue of the adjacency matrix
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    eig = 7280.00
    return eig
    ####IMPLEMENTATION ENDS HERE####
```

Let $G$ be a graph and let $\lambda_1$ be the leading eigenvalue of the adjacency matrix of $G$. Using the functions you've created above, investigate for various graphs how the following values compare:

* Maximum degree of $G$
* Average degree of $G$
* $|\lambda_1|$

These three values should satisfy a compound inequality $a \leq b \leq c$, where $a,b,c$ each uniquely take on one of the values above. In the body of the function below, uncomment the return that aligns the values to what you have found in your investigations.

```python
def algebraic_comparison() -> (str, str, str):
    """
    Return
        ???  str  the value that takes the place of 'a' in the text discussion above
        ???  str  the value that takes the place of 'b' in the text discussion above
        ???  str  the value that takes the place of 'c' in the text discussion above
    """
    ####IMPLEMENTATION STARTS HERE####
    max_deg = "maximum degree"
    avg_deg = "average degree"
    eig = "leading eigenvalue"
    # return max_deg, avg_deg, eig
    # return max_deg, eig, avg_deg
    # return avg_deg, max_deg, eig
    # return avg_deg, eig, max_deg
    # return eig, max_deg, avg_deg
    # return eig, avg_deg, max_deg
    ####IMPLEMENTATION ENDS HERE####
```

## 1.3 [2 pts] Loading graph data from a graphml file and checking it

For the rest of Part 1, we will consider a network of 128 cities in North America (mostly the US, along with a few in Canada), with city names starting with some letter (inclusively) between R and Y. Some examples of the city names included in this network are "Ravenna, OH", "Selma, AL", and "Wichita, KS". Between each pair $(u,v)$ of distinct cities $u$ and $v$ there is an edge (listed either as edge $(u,v)$ or edge $(v,u)$). For the purpose of familiarizing yourself with terminology, note that this network is an example of a *clique*. Each edge is assigned a `"weight"`, which represents the geographic distance (as the crow flies) between the cities that make up the end points of that edge.

In the body of the function below, use NetworkX's [`read_graphml`](https://networkx.org/documentation/stable/reference/readwrite/generated/networkx.readwrite.graphml.read_graphml.html) function to read the `cities_data.graphml` file located within the data folder. Note that the relative path to the file should be `"data/cities_data.graphml"`.

```python
def load_cities_data() -> nx.Graph:
    """
    Return
        G  networkx Graph
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder, load the city data into the variable G
    G = nx.Graph()
    return G
    ####IMPLEMENTATION ENDS HERE####
```

**Sanity Check** You can print the return graph of the function below and what should be shown is text of the form:

> Graph with 128 nodes and 8128 edges

```python
print(load_cities_data())
```

## 1.4 [3 pts] Weighted graphs

The "cities_data" graph is a weighted graph. That is, each edge in the graph is assigned a number, or *weight*. In NetworkX graph objects (including [`Graph`](https://networkx.org/documentation/stable/reference/classes/graph.html), [`DiGraph`](https://networkx.org/documentation/stable/reference/classes/digraph.html), and [`MultiGraph`](https://networkx.org/documentation/stable/reference/classes/multigraph.html) objects), edges can be assigned properties under any name and data type.
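For instance, here is a toy graph of our own (made-up cities and distances, not the assignment data) where an attribute is attached as each edge is added and read back afterwards:

```python
import networkx as nx

H = nx.Graph()
# Distances here are invented purely for illustration.
H.add_edge("Ravenna, OH", "Selma, AL", weight=712)
H.add_edge("Selma, AL", "Wichita, KS", weight=655)

print(H.edges["Ravenna, OH", "Selma, AL"]["weight"])  # 712, one edge's attribute
print(nx.get_edge_attributes(H, "weight"))            # dict mapping each edge tuple to its weight
```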
These attributes can be accessed using the NetworkX function [`get_edge_attributes`](https://networkx.org/documentation/stable/reference/generated/networkx.classes.function.get_edge_attributes.html). Note that this returns a dictionary, with the keys being the edges (2-tuples) of the graph, and the values being the associated value assigned to that edge.

In the body of the function below, write logic so that the function returns the 'weight' value of the particular parameter edge `e` of the parameter graph `G`. If `e` is not an edge of `G`, then raise an exception `Exception("e is not an edge of G")`.

```python
def get_edge_weight(G:nx.Graph, e) -> int:
    """
    Params
        G         networkx Graph  an arbitrary undirected graph
        e         2-tuple         a 2-tuple (a,b), where a,b may be either integers or python strings
    Return
        e_weight  int             the 'weight' value assigned to the edge e
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    e_weight = 7280
    return e_weight
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, write logic so that the function returns the 'weight' (or, in this context, *distance*) between the cities "Youngstown, OH" and "Winston-Salem, NC".

```python
def dist_from_YOH_to_WSNC() -> int:
    """
    Return
        distance  int  returns the 'weight' (distance) between Youngstown, OH and Winston-Salem, NC
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    distance = 7280
    return distance
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, compute the number of pairs of cities that are 50 miles or fewer apart.

```python
def num_cities_within_50() -> int:
    """
    Return
        num_cities  int  number of pairs of cities that are connected by a road that is
                         50 or fewer miles long
    """
    # Do not modify the line below
    G_cities = load_cities_data()
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    num_cities = 7280
    return num_cities
    ####IMPLEMENTATION ENDS HERE####
```

## 1.5 [5 pts] Generating subgraphs according to certain conditions

Notationally, let $G$ be the graph corresponding to the `cities_data.graphml` data. Let $V = V(G)$ be the node set of $G$ and $E = E(G)$ be the edge set of $G$. In the body of the function below:

1. determine all the cities that are (inclusively) within 100 miles of at least one of the cities in the list `city_list` (which is a list of city names corresponding to the node names in `cities_data.graphml`),
2. construct a new graph $S$ where the node set $V_S \subseteq V$ is exactly the set of cities found in the step above, and the edge set $E_S \subseteq E$ consists of those edges from $E$ that have both endpoints in $V_S$.

```python
def subgraph_cities_within_100_of(city_list:list) -> nx.Graph:
    """
    Params
        G          NetworkX graph object
        city_list  list of strings (names of cities in G)
    Output
        S          NetworkX graph object (subgraph of G that only contains edges between cities in
                   "city_list" and directly neighboring cities that are less than 100 miles away)
    """
    # Do not modify the line below
    G_cities = load_cities_data()
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    S = nx.Graph()
    return S
    ####IMPLEMENTATION ENDS HERE####
```

**Sanity Check** Running the cell below should yield a graph with 7 nodes and 7 edges.

```python
print(subgraph_cities_within_100_of(["Toledo, OH", "Stockton, CA", "San Francisco, CA"]))
```

# Part 2 [20 pts] Walks and Paths (Les Miserables character network)

In Part 2, we will consider the undirected, weighted network of character co-appearances in Victor Hugo's novel "Les Miserables" using `lesmis_data.gml`.
The nodes represent characters (as indicated by the labels) and the edges connect pairs of characters that have appeared together in the same chapter at least once. The weights on the edges represent the number of times those two characters are seen together throughout the novel in total.

## 2.1 [2 pts] Loading the Data into a Graph

In the body of the function below, use NetworkX's [`read_gml`](https://networkx.org/documentation/stable/reference/readwrite/generated/networkx.readwrite.gml.read_gml.html) function to read the `lesmis_data.gml` file located within the data folder. Note that the relative path to the file should be `"data/lesmis_data.gml"`.

```python
def load_lesmis_data() -> nx.Graph:
    """
    Return
        G  networkx Graph
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    # load the lesmis data as a nx.Graph object into the variable G
    G = nx.Graph()
    return G
    ####IMPLEMENTATION ENDS HERE####
```

## 2.2 [2 pts] Checking Connectedness

In the body of the function below, use NetworkX's [`is_connected`](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.components.is_connected.html) function to determine if the parameter graph `G` is connected or not.

```python
def graph_is_connected(G:nx.Graph) -> bool:
    """
    Params
        G          networkx Graph
    Return
        connected  bool  whether the graph is connected or not
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    connected = False
    return connected
    ####IMPLEMENTATION ENDS HERE####
```

## 2.3 [6 pts] Analyzing a Network using Shortest Path Lengths

In the body of the function below, compute and return a dictionary where the keys are all the possible node pairs (regardless of whether there is an edge between them or not) of the parameter graph `G` and the value for each node pair is the unweighted length of the shortest path(s) between the two nodes.

```python
def shortest_path_dict(G:nx.Graph) -> dict:
    """
    Params
        G  networkx Graph
    Return
        shortest_path_length  dict  dictionary
                                    key: node pair (a,b), where a,b are nodes of G
                                    value: the shortest unweighted path length (int) from a to b
                                    (fewest number of edges along any path between a and b)
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    shortest_path_length = {}
    return shortest_path_length
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, compute the average shortest path length among all the shortest path lengths in the parameter graph `G`.

```python
def average_shortest_path_length(G:nx.Graph) -> float:
    """
    Params
        G     networkx Graph
    Return
        aspl  float  average shortest path length
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    aspl = 7280.0
    return aspl
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, compute the maximum shortest path length among all the shortest path lengths in the parameter graph `G`.
```python
def maximum_shortest_path_length(G:nx.Graph) -> int:
    """
    Params
        G     networkx Graph
    Return
        mspl  int  maximum shortest path length
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    mspl = 7280
    return mspl
    ####IMPLEMENTATION ENDS HERE####
```

## 2.4 [4 pts] Random walks (probability)

In the body of the function below, compute a dictionary based on the parameter graph `G` where the keys of the dictionary are the node pairs $(u,v)$ of `G` and the value assigned to node pair $(u,v)$ is the probability $p(u,v)$, defined as

$$p(u,v) = \frac{\text{weight of edge between } u \text{ and } v}{\sum\limits_{w \text{ is a neighbor of } u} \text{weight of edge between } u \text{ and } w}$$

Notes:

* Accessing the proper edge attribute for this graph is done via the name `"value"` (as opposed to `"weight"`, as was used in Part 1).
* If there is no edge between $u$ and $v$, then consider the weight to be 0.
* One should not expect symmetry for $p$. That is, one should expect generally that $p(u,v) \neq p(v,u)$.

```python
def prob_dict(G:nx.Graph) -> dict:
    """
    Params
        G      networkx Graph  arbitrary networkx graph
    Return
        probs  dict  dictionary where
                     keys are pairs (a,b) of nodes a,b in G
                     values are the "value" (ie weight) of the edge between a and b
                     OR 0 if there is no such edge
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    probs = {}
    return probs
    ####IMPLEMENTATION ENDS HERE####
```

**Sanity check** Running the cell below should yield a value of about $\frac{31}{158}\approx 0.196$.

```python
print(prob_dict(load_lesmis_data())[("Valjean", "Cosette")])
```

In the body of the function below, use the probability dictionary `prob_dict` to randomly choose a neighbor node from the given `source_node`.

```python
def get_random_next_node(G:nx.Graph, source_node:str) -> str:
    """
    Params
        G            networkx Graph  arbitrary networkx graph
        source_node  str             node in G
    Return
        next_node    str             randomly chosen neighbor of source_node chosen with
                                     probability related to edge strength
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    next_node = ""
    return next_node
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, compute a single random walk within parameter graph `G` from the source node `source_node` of length `walk_length`. Use the `get_random_next_node` function from above to do so. Return the walk as a *tuple* of length `walk_length + 1` (in a walk traversing $m$ edges, there will be $m+1$ nodes).

```python
def random_walk(G:nx.Graph, source_node:str, walk_length:int) -> tuple:
    """
    Params
        G            networkx Graph  arbitrary networkx graph
        source_node  str             node in G
        walk_length  int             length (in edge distance) of the walk
    Return
        walk         tuple           tuple of nodes in G of length walk_length+1
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    walk = []
    return tuple(walk)
    ####IMPLEMENTATION ENDS HERE####
```

## 2.5 [6 pts] Stationary distribution (linear algebra)

Following the portion of the Canvas lesson **L2: Random Walks** concerning eigenvalues and eigenvectors, let us consider the inherent ordering of the nodes of the lesmis network and $P$ as the transition matrix determined by the `prob_dict` function above applied to the lesmis network. It follows that the stationary distribution $q$ of the lesmis network is a vector such that $(P^T)q = (1)q$, where $P^T$ is the transpose of $P$, $q$ is an $n\times 1$ column vector, and $n$ is the number of nodes of the lesmis network. By the definition of eigenvector and eigenvalue, $q$ is an eigenvector of $P^T$ corresponding to the eigenvalue of 1.
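As a small self-contained illustration (a 2-state toy chain of our own, separate from the assignment functions), `numpy.linalg.eig` returns the eigenvalues and a matrix whose columns are the corresponding right eigenvectors, from which one can pick out the eigenvector for the eigenvalue closest to 1:

```python
import numpy as np

# A 2-state transition matrix: row i holds the probabilities of leaving state i.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

evals, evecs = np.linalg.eig(P.T)     # columns of evecs are right eigenvectors of P^T
idx = np.argmin(np.abs(evals - 1.0))  # eigenvalue closest to 1
q = np.real(evecs[:, idx])
q = q / np.linalg.norm(q)             # normalize to length 1
if q.sum() < 0:                       # flip sign so the entries are nonnegative
    q = -q
print(q)                              # stationary direction, approximately [0.981, 0.196]
```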
```python
def get_lesmis_P() -> np.ndarray:
    """
    Return
        P  numpy ndarray  transition matrix corresponding to the les mis network
    """
    # Do not modify the following three lines
    G_lesmis = load_lesmis_data()
    G_lesmis_nodes = list(G_lesmis.nodes)
    n = len(G_lesmis_nodes)
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    P = np.zeros(shape=(n,n))
    return P
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, use NumPy's [`numpy.linalg.eig`](https://numpy.org/doc/2.2/reference/generated/numpy.linalg.eig.html) function to compute and return a right eigenvector of $P^T$ of the lesmis network that is associated with the eigenvalue of 1. If an exact eigenvalue of 1 is not found, choose the eigenvalue closest to 1. Make sure the returned eigenvector is normalized (that is, has length 1). You may use [`numpy.linalg.norm`](https://numpy.org/doc/2.2/reference/generated/numpy.linalg.norm.html) for this. Also make sure that the majority of the entries are nonnegative.

```python
def get_an_eigenvector_of_1(M:np.ndarray) -> np.array:
    """
    Params
        M          numpy ndarray  arbitrary 2-dimensional array
    Return
        evec_of_1  numpy array    1-dimensional array that is a normalized eigenvector of M
                                  corresponding to an eigenvalue of 1
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    evec_of_1 = np.zeros(shape=(M.shape[0],))
    return evec_of_1
    ####IMPLEMENTATION ENDS HERE####
```

In the body of the function below, use NumPy's [`argmax`](https://numpy.org/doc/2.2/reference/generated/numpy.argmax.html) or [`argpartition`](https://numpy.org/doc/stable/reference/generated/numpy.argpartition.html) to find the top three nodes (i.e., the most frequently occurring Les Mis characters).
python def loaddrosophilamedulla_data() -> nx.MultiDiGraph: “”” Return G networkx MultiDiGraph “”” ####IMPLEMENTATION STARTS HERE#### # this is a placeholder, load the drosophila_medulla data into a graph, G G = nx.MultiDiGraph() return G ####IMPLEMENTATION ENDS HERE#### **Sanity Check** You can print the return graph of the function below and what should be shown is text of the form: > MultiDiGraph with 1781 nodes and 33641 edges python print(loaddrosophilamedulla_data()) ## 3.2 [6 pts] Weakly connected components Using functions from the [Components](https://networkx.org/documentation/stable/reference/algorithms/component.html) suite of NetworkX, fill in logic in the body of the function below so that it: 1. finds the *weakly* connected components of this network, 2. stores the number of these components in the variable `wccs` 3. stores the component with the highest node count as a subnetwork in variable `LWCC`, 4. finds the ratio of nodes stored in `LWCC` versus the parameter network `G` and stores that value in variable `wpct` python def weak_connectedness(G:nx.MultiDiGraph) -> (int, float, nx.Graph): “”” Params G networkx MultiDiGraph Return wccs int number of weakly connected components in the graph wpct float percent of nodes in the graph that belong to the largest weakly connected component LWCC networkx Graph largest weakly connected component """ ####IMPLEMENTATION STARTS HERE#### # These lines are placeholders wccs = 7280 wpct = 0.7280 LWCC = nx.Graph() return wccs, wpct, LWCC ####IMPLEMENTATION ENDS HERE#### ## 3.3 [6 pts] Strongly connected components Using functions from the [Components](https://networkx.org/documentation/stable/reference/algorithms/component.html) suite of NetworkX, fill in logic in the body of the function below so that it: 1. finds the *strongly* connected components of this network, 2. stores the number of these components in the variable `sccs` 3. stores the component with the highest node count as a subnetwork in variable `LSCC`, 4. finds the ratio of nodes stored in `LSCC` versus the parameter network `G` and stores that value in variable `spct` python def strong_connectedness(G:nx.MultiDiGraph) -> (int, float, nx.MultiDiGraph): “”” Params G networkx MultiDiGraph Return sccs int number of strongly connected components in the graph spct float percent of nodes in the graph that belong to the largest strongly connected component LSCC networkx MultiDiGraph largest strongly connected component of G """ ####IMPLEMENTATION STARTS HERE#### # These lines are placeholders sccs = 7280 spct = 0.7280 LSCC = nx.MultiDiGraph() return sccs, spct, LSCC ####IMPLEMENTATION ENDS HERE#### ## 3.4 [6 pts] Weak connectedness vs Strong connectedness In the body of the function below, use the `weak_connectedness` and `strong_connectedness` frunctions to compute three ratios: 1. `ratio_size`. This ratio should be the size (number of nodes) of the largest strongly conneced component divided by the size of the largest weakly connected component, 2. `ratio_mspl`. This ratio should be the maximum shortest path length of the largest strongly connected component divided by the maximum shortest path length of the largest weakly connected component, 3. `ratio_aspl`. 
This ratio should be the average shortest path length of the largest strongly connected component divided by the average shortest path length of the largest weakly connected component, python def ratiosstrongover_weak(G:nx.MultiDiGraph) -> (float, float, float): “”” Params G networkx MultiDiGraph Return ratio_size float explained above ratio_mspl float explained above ratio_aspl float explained above """ ####IMPLEMENTATION STARTS HERE#### # These lines are placeholders ratio_size = 0.0 ratio_mspl = 0.0 ratio_aspl = 0.0 return ratio_size, ratio_mspl, ratio_aspl ####IMPLEMENTATION ENDS HERE#### # Part 4 [20 pts] Topological Ordering (Programming languages network) In Part 4, we will consider a influence network among programming languages. Each node in the network is a programming language. Each directed edge in the network signifies that the source programming language influenced the target programming language in some critical way. The data for this network is contained in the `language_data.txt` file. Each line of this text file represents one edge. If a line reads `language_A language_B` (separated by exactly one space), then this means that `language_A` influenced `language_B`. In the influence network we will be using, there should be a directed edge from source node `language_A` to target node `language_B`. To confirm that it should be this way, observe that one of the lines in the `language_data.txt` file reads `c c++`. It is commonly known that C++ was developed as an extension to the C language - hence, C influenced the development of C++. ## 4.1 [2 pts] Loading graph data from a txt (edgelist) file In the body of the function below, use NetworkX's [`read_edgelist`](https://networkx.org/documentation/stable/reference/readwrite/generated/networkx.readwrite.edgelist.read_edgelist.html) function to read the `language_data.txt` file located within the data folder. Note that the relative path to the file should be `"data/language_data.txt"`. python def loadlanguagedata() -> nx.DiGraph: “”” Return G networkx DiGraph “”” ####IMPLEMENTATION STARTS HERE#### # This is a placeholder G = nx.DiGraph() return G ####IMPLEMENTATION ENDS HERE#### **Sanity Check** You can print the return graph of the function below and what should be shown is text of the form: > DiGraph with 361 nodes and 735 edges python print(loadlanguagedata()) ## 4.2 [2 pts] Directed acyclic graph (DAG) In the body of the function below, use NetworkX's [`is_directed_acyclic_graph`](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.dag.is_directed_acyclic_graph.html) function to determine whether an arbitrary (directed) parameter graph `G` is a directed acyclic graph (DAG). Return a boolean value representing if G is a DAG. Notes: * 'acyclic' simply means 'no cycles,' * The function below should return a bool value based on the parameter graph `G`. The function is not explicitly tied to the language network. python def isgraphdag(G:nx.DiGraph) -> bool: “”” Params G networkx DiGraph arbitrary directed graph Return is_dag bool True if parameter graph G is directed and acyclic False otherwise """ ####IMPLEMENTATION STARTS HERE#### # This is a placeholder is_dag = None return is_dag ####IMPLEMENTATION ENDS HERE#### ## 4.3 [6 pts] Topological generations of a DAG In the body of the function below, 1. Check that the parameter graph `G` is a DAG. If not, return two empty dictionaries. Otherwise, continue to step 2. 2. 
Using the NetworkX function [`topological_generations`](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.dag.topological_generations.html) compute a dictionary, where the keys are integers representing the topological layer, and each value is the list of nodes found in that topological layer. For example, in the 0 layer should be all the so-called *start nodes* that have 0 in-degree. python def topgendicts(G:nx.DiGraph) -> (dict, dict): “”” Params G networkx DiGraph arbitrary directed graph Return top_gen dict dictionary where the keys are integers 0, 1, 2... representing topological layers values are lists of nodes that are in that topological layer node_gen dict dictionary where the keys are nodes of G values are the topological layer integer the node is in """ ####IMPLEMENTATION STARTS HERE#### #These are placeholders top_gen = {} node_gen = {} return top_gen, node_gen ####IMPLEMENTATION ENDS HERE#### ## 4.4 [4 pts] Topological ordering of nodes in a DAG In the body of the function below, 1. Check that the parameter graph `G` is a DAG. If not, return an empty list. Otherwise, continue to step 2. 2. Using your function `top_gen_dicts` above, fill in the body of the function below with a computation that will return a list of nodes of the parameter graph `G` that are a topological ordering. In order to receive full credit for this subpart, the function below must call `top_gen_dicts` on the parameter graph and present an ordering that exactly matches the natural ordering provided by that dictionary. NOTE: You do not need to sort the lists in `top_gen` from `top_gen_dicts`. The order of nodes in each list comes directly from `topological_generations`, which already respects the DAG’s dependencies and provides a valid topological order. Sorting would violate this natural ordering. python def atopologicalordering(G:nx.DiGraph) -> list: “”” Params G networkx DiGraph arbitrary directed graph Return top_order list a topological ordering of the nodes of G, must be based on `topological_generations_dict` """ ####IMPLEMENTATION STARTS HERE#### # This is a placeholder top_order = [] return top_order ####IMPLEMENTATION ENDS HERE#### ## 4.5 [6 pts] Start nodes and highest total influence In the body of the function below, 1. Check that the parameter graph `G` is a DAG. If not, return an empty dictionary and empty string. Otherwise, continue to steps 2 and 3. 2. Using your function `top_gen_dicts` above, for each node in the 0-layer topological layer (these nodes are called *start nodes*), find the total influence of that node in the graph and store it in a dictionary, where the key is the start node name and the value is an integer representing its total influence. The total influence of a start node is equal to the number of descendants the node has at *any distance* from it. For example: * C influenced C++, * C++ influenced C#, and * C# influenced Rust, * so we would consider C to have influenced Rust. 3. Return the name of the node with the highest influence. 
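As a toy illustration of counting descendants at any distance (our own example, separate from the assignment skeleton below), NetworkX's `descendants` function returns every node reachable from a given source:

```python
import networkx as nx

# Toy influence chain: c -> c++ -> c# -> rust.
H = nx.DiGraph([("c", "c++"), ("c++", "c#"), ("c#", "rust")])
print(nx.descendants(H, "c"))  # a set of 3 nodes: {'c++', 'c#', 'rust'}, i.e. total influence 3
```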
## 4.5 [6 pts] Start nodes and highest total influence

In the body of the function below,
1. Check that the parameter graph `G` is a DAG. If not, return an empty dictionary and an empty string. Otherwise, continue to steps 2 and 3.
2. Using your function `top_gen_dicts` above, for each node in topological layer 0 (these nodes are called *start nodes*), find the total influence of that node in the graph and store it in a dictionary, where the key is the start node name and the value is an integer representing its total influence. The total influence of a start node is equal to the number of descendants the node has at *any distance* from it. For example:
* C influenced C++,
* C++ influenced C#, and
* C# influenced Rust,
* so we would consider C to have influenced Rust.
3. Return the name of the node with the highest influence.

```python
def start_nodes(G:nx.DiGraph) -> (dict, str):
    """
    Params
    G                       networkx DiGraph    arbitrary directed graph
    Return
    start_node_dict         dict    dictionary explained above
    start_node_highest_inf  str
    """
    ####IMPLEMENTATION STARTS HERE####
    # These are placeholders
    start_node_dict = {}
    start_node_highest_inf = ""
    return start_node_dict, start_node_highest_inf
    ####IMPLEMENTATION ENDS HERE####
```
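One plausible realization of steps 2 and 3, assuming NetworkX's [`descendants`](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.dag.descendants.html) is an acceptable reading of "descendants at any distance" (it returns all nodes reachable from the given node); `top_gen_dicts_sketch` refers to the 4.3 sketch above:

```python
import networkx as nx

def start_nodes_sketch(G: nx.DiGraph) -> (dict, str):
    if not nx.is_directed_acyclic_graph(G):
        return {}, ""
    top_gen, _ = top_gen_dicts_sketch(G)
    # Total influence of a start node = number of nodes reachable from it
    start_node_dict = {n: len(nx.descendants(G, n)) for n in top_gen.get(0, [])}
    start_node_highest_inf = max(start_node_dict, key=start_node_dict.get, default="")
    return start_node_dict, start_node_highest_inf
```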
# Part 5 [20 pts] Bipartite Graphs and Projections (Github network)

The final dataset is a bipartite network based on the association of github users and projects. The `github_data.txt` file contains lines that specify associations between a user on the left and a project on the right. The set of user nodes makes up a collective "left side" of the network and the set of project nodes makes up a collective "right side" of the network, with every edge having a user as one end point and a project as the other.

## 5.1 [2 pts] Checking if a graph is bipartite

In the body of the function below, use NetworkX's [`is_bipartite`](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.bipartite.basic.is_bipartite.html#networkx.algorithms.bipartite.basic.is_bipartite) function to check whether the parameter graph `G` is bipartite or not.

```python
def is_graph_bipartite(G:nx.Graph) -> bool:
    """
    Params
    G           networkx Graph  arbitrary graph
    Return
    graph_is_bp bool    boolean determined by whether parameter graph G is bipartite
    """
    ####IMPLEMENTATION STARTS HERE####
    # This is a placeholder
    graph_is_bp = None
    return graph_is_bp
    ####IMPLEMENTATION ENDS HERE####
```

## 5.2 [6 pts] Loading graph data from a txt (edgelist) file

In the body of the function below,
1. use NetworkX's [`read_edgelist`](https://networkx.org/documentation/stable/reference/readwrite/generated/networkx.readwrite.edgelist.read_edgelist.html) function to read the `github_data.txt` file located within the data folder. Note that the relative path to the file should be `"data/github_data.txt"`,
2. store all user nodes in the `user_list` variable, and
3. store all project nodes in the `project_list` variable.

```python
def load_github_data() -> (nx.Graph, list, list):
    """
    Return
    G               networkx Graph  arbitrary graph
    user_list       list            list of user nodes
    project_list    list            list of project nodes
    """
    ####IMPLEMENTATION STARTS HERE####
    G = nx.Graph()
    user_list = []
    project_list = []
    return G, user_list, project_list
    ####IMPLEMENTATION ENDS HERE####
```

**Sanity Check** You can print the returned values using the calls below; the output should be text of the form:

> Graph with 177386 nodes and 440237 edges
> 56519
> 120867

```python
print(load_github_data()[0])
print(len(load_github_data()[1]))
print(len(load_github_data()[2]))
```

## 5.3 [2 pts] Biadjacency matrix of a bipartite graph

In the body of the function below:
1. Check that all the conditions below are met. If at least one of them is not met, return a `None` value. Otherwise continue on to step 2:
* The parameter graph `G` is bipartite,
* The `row_nodes` argument is nonempty and each element is a node of `G`,
* The `column_nodes` argument is nonempty and each element is a node of `G`.
2. Using NetworkX's [`biadjacency_matrix`](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.bipartite.matrix.biadjacency_matrix.html), compute and return the biadjacency matrix of the parameter graph `G`. Use SciPy's [`tolil`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_array.tolil.html#scipy.sparse.csr_array.tolil) method to convert the biadjacency matrix to the `sp.sparse._arrays.lil_array` format.

```python
def get_biadjacency_matrix(G:nx.Graph=nx.Graph(), row_nodes:list=[], column_nodes:list=[]) -> sp.sparse._arrays.lil_array:
    """
    Params
    G               networkx Graph  an arbitrary NetworkX graph
    row_nodes       list            a list of the "left side" nodes
    column_nodes    list            a list of the "right side" nodes
    Return
    B   sp.sparse._arrays.lil_array the biadjacency matrix of the graph (in lil_array format)
    """
    ####IMPLEMENTATION STARTS HERE####
    # These are placeholder lines
    B = sp.sparse.lil_array((2,3))
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            B[i,j] = i + j
    return B
    ####IMPLEMENTATION ENDS HERE####
```

## 5.4 [4 pts] Matrix products

For a general $m \times n$ matrix $B$, we *cannot* matrix-multiply $B$ by itself if $m \neq n$. However, note that we *can always* matrix-multiply $B$ by $B^T$, since $B$ is an $m \times n$ matrix and $B^T$ is an $n \times m$ matrix (the "inside dimensions" match), giving a resultant $m \times m$ matrix $BB^T$. Similarly, we can matrix-multiply $B^T$ by $B$ to get an $n \times n$ matrix $B^TB$.

In the body of the function below, use the [`transpose`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.transpose.html) method to transpose a scipy sparse array and the `@` operator to perform matrix multiplication between two scipy sparse arrays. More information about scipy sparse arrays can be found [here](https://docs.scipy.org/doc/scipy/reference/sparse.html).

```python
def sparse_array_products(B:sp.sparse._arrays.lil_array) -> (sp.sparse._arrays.lil_array, sp.sparse._arrays.lil_array):
    """
    Params
    B   sp.sparse._arrays.lil_array
    Return
    prod_BBT    sp.sparse._arrays.lil_array
    prod_BTB    sp.sparse._arrays.lil_array
    """
    ####IMPLEMENTATION STARTS HERE####
    # These are placeholder lines
    B_test = sp.sparse.lil_array((2,3))
    for i in range(B_test.shape[0]):
        for j in range(B_test.shape[1]):
            B_test[i,j] = i + j
    prod_BBT = copy.deepcopy(B_test)
    prod_BTB = copy.deepcopy(B_test)
    return prod_BBT, prod_BTB
    ####IMPLEMENTATION ENDS HERE####
```
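A minimal sketch for 5.4 under the stated shapes; the trailing `tolil` conversions only keep the return values in the `lil_array` format named in the signature:

```python
def sparse_array_products_sketch(B):
    # B is m x n, so B @ B.T is m x m and B.T @ B is n x n
    BT = B.transpose()
    prod_BBT = (B @ BT).tolil()
    prod_BTB = (BT @ B).tolil()
    return prod_BBT, prod_BTB
```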
## 5.5 [6 pts] One mode projections and greatest commonality

The biadjacency matrix $B$ of the github network is an $m \times n$ matrix, where $m$ is the number of users in the network and $n$ is the number of projects in the network. That is, the rows represent the users and the columns represent the projects. Alternatively, the matrix $B^T$ is an $n \times m$ matrix where the rows represent the projects and the columns represent the users. The matrix product $BB^T$ is an $m \times m$ matrix that is the adjacency (note, *not* *bi*-adjacency) matrix of a graph whose nodes consist only of users. This graph is called the one-mode projection on users. Alternatively, the matrix product $B^TB$ is an $n \times n$ matrix that is the adjacency matrix of a graph whose nodes consist only of projects. This graph is called the one-mode projection on projects.

Another way to look at the product $BB^T$ is that its entries count walks of length 2 from one user node back to another user node. In the case of the github network, which is bipartite, the only way to have a walk of length 2 from a user node back to a user node is for the walk to pass through exactly one project. Thus, the $(i,j)$-entry of $BB^T$ is the number of projects that user $i$ shares with user $j$.

In the body of the function below, use the information above to compute the pair of users who share the greatest number of projects and the pair of projects that share the greatest number of users.

```python
def greatest_github_commonality() -> (tuple, tuple):
    """
    Return
    user_pair       2-tuple pair of users that share the most projects
    project_pair    2-tuple pair of projects that share the most users
    """
    ####IMPLEMENTATION STARTS HERE####
    # These are placeholders
    G_github, user_list, project_list = load_github_data()
    user_pair = ("Yo", "wassup")
    project_pair = ("Hello", "world")
    return user_pair, project_pair
    ####IMPLEMENTATION ENDS HERE####
```
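Tying 5.5 together, a hedged sketch. The diagonal of each product counts a node's own neighbors rather than a pair, so diagonal entries are skipped before taking the maximum; `load_github_data` and `get_biadjacency_matrix` are the functions defined above, and forming the full products on the real data may be slow and memory-hungry:

```python
def greatest_github_commonality_sketch() -> (tuple, tuple):
    G, user_list, project_list = load_github_data()
    B = get_biadjacency_matrix(G, user_list, project_list).tocsr()

    def best_pair(P, names):
        # Scan the sparse entries, skipping diagonal (self) entries
        P = P.tocoo()
        _, i, j = max((v, r, c) for v, r, c in zip(P.data, P.row, P.col) if r != c)
        return names[i], names[j]

    user_pair = best_pair(B @ B.transpose(), user_list)
    project_pair = best_pair(B.transpose() @ B, project_list)
    return user_pair, project_pair
```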

$25.00 View

[SOLVED] Cpe2600 lab 2 - building programs

Introduction
The purpose of this assignment is to explore how programs are built in the C programming language. You will be guided through how to use the GNU C Compiler (gcc) as well as write several programs and see what steps are taken to translate them from source into an executable. Throughout the exercise, collect screenshots and answer the questions posed in a Word document. Be sure to include the questions you are answering along with your answers (e.g. copy the question from the exercise and add your answer). You will need to submit this to earn credit for the exercise.
Objectives
By the end of this assignment you will be able to: • Create C source files • Link C source files together through header files and include statements • Use preprocessor directives • Display and analyze the output at each step in the compilation of a C program
Background and References
In order to execute a program written in the C programming language, it must first be 'compiled'. To compile something means "to translate instructions from one computer language into another so that a particular computer can understand them" ~ https://www.oxfordlearnersdictionaries.com/us/definition/english/compile. In our case, that means translating the C source code into instructions that can be executed by our CPU. The compiler that we will be using is called gcc, which stands for the GNU C compiler. GNU is a recursive acronym which stands for GNU is Not Unix, chosen because the behavior of GNU tools is 'Unix-like' in the sense that they follow the design of Unix but are free and open source and contain no Unix source ~ https://www.gnu.org/.
Phases of Compilation
In this section you will be analyzing the output at each phase of compilation. The GNU C compiler utilizes four (4) phases of compilation: 1. Preprocessing 2. Compiling 3. Assembling 4. Linking. At each phase of compilation, gcc performs a portion of the compilation tasks; the output of each phase is 'fed in' as input to the next phase, with the final phase outputting the program executable. For this activity start with the source for hello world:

/**********************************
 * hello.c
 *
 * A hello world program in c
 ***********************************/
#include <stdio.h>

int main(int argc, char* argv[])
{
    // Print a message to the user
    printf("Hello World!\n");

    // Return to exit the program
    return 0;
}

Preprocessing
Including Files: In this phase, gcc takes the C source input and 'pre-processes' it. The input is C source code and the output is also C source code. The difference in the code after preprocessing is that all preprocessor 'directives' have been handled and the resulting source is ready to be compiled. More details on the gcc preprocessor (cpp) can be found here: https://gcc.gnu.org/onlinedocs/cpp/. SUBMISSION REQUIREMENT: Research the gcc preprocessor. 1. Determine what command 'flag' parameter needs to be specified to force gcc to stop after the preprocessor phase. 2. What is the purpose of the #include preprocessor directive? How does the output of the preprocessor differ from your input source? 3. Write the exact command necessary to run the preprocessor on hello.c and output the result to hello.i. Include a screenshot of a portion of this file in your submission. ⓘ There are a couple of ways to accomplish this. Perhaps refer to last week's exercise. The behavior of one of the ways of doing this is a bit different than the -c or -S options.
Macros
One useful preprocessor directive is called a macro.
This allows you to define a constant value or operation that is used in multiple places in the code. When referenced, the preprocessor does a direct text replacement of the name with what it is defined as. For example, say you want to define a constant for the length of an array to be used later. Using a preprocessor macro for this would look like: ⓘ We have not yet talked about arrays – we will. Simple enough… Similar to Java, but you rarely used native arrays in Java (for good reason).

// Exploring #define
#define ARRAY_LENGTH 10

int main(int argc, char* argv[])
{
    int array1[ARRAY_LENGTH];
    int array2[ARRAY_LENGTH];

    for(int i = 0; i < ARRAY_LENGTH; i++)
    {
        // Do something cool with array1 and array2
    }
}

If you need to change the length of the arrays, you just need to change the value within the #define and rebuild your program. SUBMISSION REQUIREMENT: Research gcc preprocessor macros. 1. Run gcc on this source and force it to stop at the preprocessor output phase. What do you observe about the output? Save your output and include it with your submission. 2. Can you foresee any issues with this mechanism? As an experiment, change ARRAY_LENGTH from 10 to "xyz" (include quotes). Compile and look at the error message(s). Are the error messages clear?
Compiling
In this phase, gcc takes the pre-processed C source and translates it into assembly language for the target CPU architecture. SUBMISSION REQUIREMENT: Research the gcc compiler. 1. From your research, determine what command 'flag' parameter needs to be specified to force gcc to stop after the compiler phase. 2. Write the exact command necessary to run the compiler on hello.c and output the assembly language to hello.s. Include a snip of this file in your submission.
Assembling
While the assembly language output from the compilation phase is written in CPU instructions, it remains in 'plain text' and cannot be interpreted directly by the CPU. In the assembly phase, gcc takes the assembly language and 'assembles' it into the binary representation that can be executed by the CPU. More details on the gcc assembler (as) can be found here: https://sourceware.org/binutils/docs-2.23.1/as/index.html. The output of the assembly phase is called an "object" file. The object file contains executable "machine" code, but by itself is not yet an executable program. There is one more phase in the build process. SUBMISSION REQUIREMENT: Research the gcc assembler. 1. Determine what command 'flag' parameter needs to be specified to force gcc to stop after the assembler phase. 2. Write the exact command necessary to run gcc on hello.c and output the object file to hello.o. Include a snip of this file in your submission.
Linking
This is the final phase of a build. The output from this phase is a working executable that can be executed directly from the Linux command line. More details on the gcc linker (ld) can be found here: https://sourceware.org/binutils/docs-2.23.1/ld/index.html. SUBMISSION REQUIREMENT: Research the gcc linker. From your research: 1. What is static linking vs dynamic linking? 2. Determine what 'flag' parameter needs to be specified to force gcc to perform static linking. 4. Compare the file sizes of the statically linked executable to a dynamically linked executable. Include a screenshot showing the relative sizes of both executables. Why are they different?
Building Programs from Multiple Files
While putting all our code in one single source file would work, doing so does not promote good code organization and code reuse.
As such, it is often useful to create a program from multiple files. SUBMISSION REQUIREMENT: Create a program for calculating mathematical operations. This program will be in two (2) source files. 1. mymath.c – will contain the implementation of two functions for math operations: – int maxvalue(int bits) – returns the maximum value that can be represented by bits bits. – int numbits(int value) – returns the number of bits needed to represent value. ⓘ Do not overthink these functions – they each can be implemented with a single line of code. 2. main.c – contains the main function. Your main should exercise the functions in mymath.c with a variety of arguments. At a minimum, you should display the maxvalue of 8, 16, and 32 bits, and display numbits for 7, 8, 15, and 16. Print the results of each call to the console. In your main.c and/or mymath.c, experiment with using preprocessor macros to define constant values. ⓘ Make sure all code fits the style guidelines and contains all necessary comments and comment blocks. SUBMISSION REQUIREMENT: Research the gcc compiler: 1. From what you have learned in class and through your research, how do you build a program from multiple files? 2. What additional file is needed in order to avoid warnings or errors? 3. What additional argument is needed for gcc to properly build the project? Why? Where did you find this documented? Create all the necessary files along with the exact command you needed to run to build your math program. Be sure (out of habit) to include the command line option for "all" warnings even though it is not explicitly necessary. Include screen caps of all of your files as well as the output of your program with your submission.
Deliverables

$25.00 View

[SOLVED] Cs2340 assignment#2

CS/SE 2340 – Assignment#2
a. B[4] = A[4+2i] ;
b. B[4+4i] = A[4] ;
c. B[8i+2] = A[2g+h] + h;
2- If we assume we place the following MIPS code starting at location 4000 in memory, what is the MIPS machine code for this code? (Please show the binary and decimal value for each field.) Assume that i is associated with register $t0, size is associated with register $a1, and the base of the array save is in $a0. Please also explain each instruction and specify its type (R format, I format, or J format). Finally, convert the following code to C.* Please change the first instruction to add $t0, $zero, $zero
3- Write MIPS assembly for the following function. Assume N is passed to your function in register $a0. Your output should be in register $v0 at the end of your function. Note: You must implement this function recursively. The purpose of this assignment is to learn how to manipulate the stack correctly in MIPS.

int Myfun(int N)
{
    if (N > 0)
        return Myfun(N-1) + N;
    else
        return 0;
}

Please explain each instruction with a comment. Please submit your source code and a screenshot that shows the registers with the correct output value for N=10, i.e., Myfun(10) returns 55.
4- Translate the following program to a MIPS assembly program (please explain each instruction in your code with a comment and submit a .asm file).
5- Translate the following program to a MIPS assembly program (please explain each instruction in your code with a comment and submit a .asm file).

$25.00 View

[SOLVED] Cpe2600 lab 13 - final project

Lab 13 Final Project
Overview
As we close out the semester it is time to put your new knowledge to work. For this final lab, create a C program of your own design and purpose. While it would be ideal to incorporate all the major topics of the course, that would be impractical. Rather, focus on a problem you wish to solve and then apply appropriate techniques that you have learned to create a solution.
Basic Requirements
• Your project will reside in a GitHub repository (assignment accepted through GitHub Classroom – link to assignment: https://classroom.github.com/a/mR54GZeG). If working with another student, one of you should accept the assignment and then figure out how to add the other student as another developer on that repo.
• Your project must use a Makefile.
• Your repo must include a README.md file in markdown format that documents the project and includes sample input/output/usage, possibly including screen captures. Be sure to clearly explain the purpose of the program.
• Your repo should include a proper .gitignore.
• All source code must comply with course documentation standards.
• Your project should build without warnings and not have any detectable memory leaks.
• If your project uses command-line arguments (preferred over interactive input), you should consider using getopt to process arguments.
• Your project should incorporate some aspect that we did not explore in any other assignment, such as a new library, a new IPC technique, or a new type of system call.
• Your project should read and/or write files if appropriate for the application.
• Your project should use dynamic memory or shared memory if appropriate for the application.
• Although not typical, if you have any input file(s) needed for operation, include them in the repo and document them in the README.md.
Some Ideas
• Image Pipeline o Perform image processing with several functions such as ▪ Color Space Conversion (if color) ▪ Enhancement ▪ Median Filter ▪ Histogram Equalization ▪ Color Space Conversion
• Video Processing o Explore reading and writing a video stream and perform simple processing on the set of images.
• Voting machine o Implement a machine that gathers votes from different terminals with usernames or PIDs
• Console printer to screen queue o A receiving user reads messages one by one as they queue up from multiple other sources
• Producer/consumer problem for a bank account o Handle several transactions accurately from multiple sources
• Chat program between 2 terminals o This could be two terminals on the same laptop o or over UDP/TCP or Unix sockets
• 2D Wavelet transform on an image o Demonstrate the reduced-size low-pass image
• Discrete Fourier Transform (DFT), then zero out bins to filter the signal, and Inverse DFT o Use the rows of an image to perform the filtering with the DFT
• Control System implementation o First process captures ADC counts o Second process converts the ADC counts to temperatures, pressures, and levels o Third process uses the temperatures, pressures, and levels to control 2 loops o Fourth process: a state machine determines actuator positions from input values o Final process writes a DAC value (0-10V) to control the machine
Deliverable
• When you are ready to submit your assignment, prepare your repository:
• Make sure your name, assignment name, and section number are in all files in your submission – in the comment block of source file(s) and/or at the top of your report file(s).
• Make sure you have completed all activities and added all new program source files to the repository.
• Make sure your assignment code is commented thoroughly. • Make sure all files are committed and pushed to the main branch of your repository. To submit, copy the URL for your repository and submit the link to the associated Canvas assignment. If working with other students, all students must submit the repo link to Canvas.

$25.00 View

[SOLVED] Cpe2600 lab 11 - multiprocessing

Lab 11 Multiprocessing
Acknowledgement: Many aspects of this assignment are modeled after this: Operating Systems Principles (fiu.edu). See additional acknowledgements in the template code.
Topical Concepts
Multiprocessing
Although there are a number of benefits of multiprocessing, the most obvious is to achieve a speedup of lengthy operations. To really explore these benefits, we have to have a lengthy operation. How about generating an image of a fractal? The Mandelbrot set is one such calculation that can be made and easily visualized. Depending on a variety of parameters, generation of a single image of the Mandelbrot set can take up to a few seconds on modern computing hardware. To aid in visualization, often many images are calculated by varying parameters such as origin or scale factor and then pieced into a movie. Examples can be seen here: Mandelbrot set – Wikipedia. The primary goal of this phase of the project is to employ multiprocessing to enjoy a speedup while generating Mandelbrot plots. A program in the starter repository (mandel) has been provided that features a command line interface which allows adjustment of various parameters and will then generate a Mandelbrot set visualization and save it to a jpeg file. Note that mandel uses 'getopt' to process command line arguments. You will need to use getopt for your new program or extend it to accommodate the new command line options if you choose to modify the existing program. Once you have your 50 frames (or more if you wish) you can stitch them together into a movie using ffmpeg (you will need to install it – sudo apt update / sudo apt install ffmpeg). Note, you will also need to install a development library to handle jpeg files to build mandel (sudo apt install libjpeg-dev). Once everything is working, use the time command to measure the time it takes to generate the 50 images with various allowed numbers of child processes (1, 2, 5, 10, 20 for example). Plot the results. Does using more processes always speed up the operation? See below for the deliverable. In Lab 12 we will investigate a multithreaded version.
Git Source Version Control – Next Steps
Branches
So far we have been using git in its simplest form, really just as a tool to track a "linear" set of changes to a software project. We have an easy way to track changes, to mark milestones, and to undo changes. This is quite useful by itself if you are a solo developer and are not really sharing your code or repository with anyone. In reality, you will rarely be a solo developer, and your repository will likely be accessed by other developers and stakeholders. So, we must introduce additional features to deal with more complicated scenarios. The first new concept is that of a branch. Like many things, this can get very complicated, so we will just be scratching the surface here as well. So far, we have just a single branch in our repository. By default, it is named "main." As we made changes, we committed those changes to the main branch. However, if we are working with other developers or even just publishing our repository for public consumption of our project, there is an expectation that the main branch is always deployable. That is, a tested, stable version of the project. In addition, the numerous commits that you make leading up to that next stable version can ultimately create a lot of clutter.
So, you should be doing your actual development on a branch, leaving the main branch unchanged until the new features developed on the branch are fully tested and ready to deploy. At that point the branch can be merged back to the main branch. Some guidelines suggest maintaining a parallel development branch and only merging to main for releases. With the -b option, git checkout will create and switch to the new branch. Any commits made now will be on the branch. You can go back to the main branch with another git checkout command, but be careful. Any commits made on the main branch will complicate merging the branches later. We can commit to this branch now, but technically the branch does not exist in the remote repository, so right now, at least, we cannot push. To do that, there is one additional step we need to take. The following command will push to remote and specifically make the same branch on the remote repository.

Now we can commit changes locally and push to remote the same way we have always done. Until we checkout a different branch or commit in the repo, we will be adding all commits on the branch (which is what we want). Interestingly, as long as no commits are made to main, commits will still appear "linear." If you were to graph commits, you might see this:
Merging
Ultimately, changes made on a branch will need to be merged back onto the main branch. As stated, ideally, no commits have been made to main since you are doing your work on a branch. So, the process of a merge is pretty trivial. Be sure that you have no uncommitted changes when you start a merge, or you might wind up with some unintended issues. If your current location (HEAD) is still on the branch, and you attempt to merge with main, nothing really happens because, although it is called a branch, if no commits have been made on main, there is no bifurcation in the history. To successfully merge to main, you will need to change your location to the main branch (checkout) and issue the merge from there.

Note the message "Fast-forward". This basically acknowledges no changes were made to main since the branch. If changes had been made, it would be a "true merge" or a "three-way merge", which has a potential for conflicts. If there are conflicts you will have the opportunity to undo the merge or proceed. If you proceed, your files will be marked where there are conflicts and you will need to manually fix them. A pain in a large project, for sure. You can push to remote when done and have your new "release" ready to go.
The Exercise
Specific Instructions
1. Accept the new GitHub Classroom assignment at https://classroom.github.com/a/0nNkoC5K and clone the repository.
2. Before making any changes, create a branch named "labWeek11dev" and switch to that branch. Push the branch to the remote repo as shown above.
5. Collect runtimes for 1, 2, 5, 10, and 20 child processes. Plot into a graph with # processes on the X axis and runtime on the Y axis. Prepare a brief report in README.md (edit the one in your repository). In the report, include: a. A brief overview of your implementation. b. The graph of your runtime results. You will need to export the plot (from Excel) into an image, add the image to your repo, and then link it into the README.md. c. A brief discussion of your results.
6. Create a movie of your 50 images. You can use the ffmpeg tool – this command should work: ffmpeg -i mandel%d.jpg mandel.mpg
*** We will demo a few people's movies in the lab. Include your best movie in your repo for review and grading.
***
Deliverable
• When you are ready to submit your assignment, prepare your repository:
• Make sure your name, assignment name, and section number are in all files in your submission – in the comment block of source file(s) and/or at the top of your report file(s).
• Make sure you have completed all activities and added all new program source files to the repository.
• Make sure your assignment code is commented thoroughly.
• Make sure all files are committed and pushed to the main branch of your repository.
• Tag your repo with the tag vFinal
• Include your best movie in the repo (we would not normally include such a file in a repo, but for grading we do)
***NOTE***: Do not forget to 'add', 'commit', and 'push' all new files, branches, and changes to your repository before submitting. To submit, copy the URL for your repository and submit the link to the associated Canvas assignment and add a comment on Canvas that you have completed Lab 11.

$25.00 View

[SOLVED] Cpe2600 lab 1 - wsl setup

Lab 1 WSL Setup
Introduction
The purpose of this assignment is to set up the Windows Subsystem for Linux (WSL). WSL acts as a virtual machine that will allow you to develop code on a Linux-based operating system while working in your Windows environment. By completing this exercise, you will have the ability to install Linux within WSL, create a program, and start learning about operating systems based on Linux. For more information on WSL see the documentation on Microsoft's web page: https://docs.microsoft.com/en-us/windows/wsl/ The deliverable will be a series of screen-caps depicting various steps of the setup procedure with occasional commentary. Collect the screen caps and commentary in a Word doc as you go and you will submit the completed assignment in pdf format to Canvas. The submission is expected prior to departing the lab period in Week 1.
Objectives
By the end of this assignment you will be able to: • Create a fresh installation of Linux • Understand the concepts of the Windows Subsystem for Linux (WSL) • Demonstrate an ability to use a Linux shell. • Use the 'man' command to obtain documentation about Linux commands • Explain how to list the contents of a directory in multiple forms. • Navigate the Linux file system by changing directories. • Manage the creation and deletion of new files and directories from within the command shell.
Background and References
Unix is a multitasking and multiuser operating system originally developed in the 1960s. Since its original creation it has received many updates, and today many variants of the operating system exist, all of which follow the principles of the UNIX Philosophy of minimalist OS design and modular development. The interfaces provided by the Unix operating system have been standardized by the IEEE as the Portable Operating System Interface (POSIX). The idea is that by following certain rules a software application can be executed on any POSIX-compliant operating system without having to be modified. Of course, different CPU architectures would require the software to be rebuilt to execute a different instruction set architecture, but the source for the application could remain the same. In 1991, as a personal project, Linus Torvalds began working on his own variant of Unix to be published as free and open software. This operating system eventually became Linux, a portmanteau of Linus and Unix. While not as popular as a desktop operating system, Linux is utilized on many server platforms, embedded systems, and is the basis for the Android operating system. While you often hear Linux described as an operating system, Linux itself is not an operating system, but a kernel. Many operating systems have been built using the Linux kernel. These operating systems (e.g. Ubuntu, Mint, Arch, Raspberry Pi OS) are often called 'distributions'. These operating systems bundle the Linux kernel with additional applications and services. In this class, we will be learning and using POSIX services provided by the Linux kernel. In order to use a Linux distribution, there are several options available: • Find a computer and install a Linux distribution on it • Install a Linux distribution in a virtual machine such as VMWare or VirtualBox • Use the Microsoft Windows Subsystem for Linux (WSL) which allows a Linux distribution to run within the environment of Windows. For this class, we will be using WSL since your MSOE-issued laptops already have Windows installed and working with a virtual machine can be a bit cumbersome.
But, as mentioned earlier, any application written to the POSIX interfaces will work on any variant of Unix.
Install Windows Subsystem for Linux (WSL)
The following procedure is documented in Microsoft's WSL install web page: https://docs.microsoft.com/en-us/windows/wsl/install To install a Linux distribution in WSL, you need to perform 2 things: 1. Install the WSL variant of the Linux kernel 2. Install the distribution for the Linux operating system. Microsoft makes this easy by doing both in a single step. By default, the WSL installer installs the Ubuntu Linux distribution, which is sufficient for the work we'll be doing. If you want to look into installing other distributions, you can find information on the Microsoft WSL install web site: https://docs.microsoft.com/en-us/windows/wsl/install#ways-torun-multiple-linux-distributions-with-wsl To install WSL and Ubuntu: 1. Open a command window as Administrator 1. Open the start menu and type cmd.exe 2. In the menu on the right, click: Run as Administrator 2. When the command prompt opens run wsl --install This might take a while as WSL installs the kernel and the Linux distribution. It will default to the Ubuntu distribution, although others are available. It will also default to WSL 2. You'll see status messages printed to the command prompt as it goes. Once it has completed installation, you will most likely be required to reboot your PC. Once your PC restarts, you will be asked to create a username and password for your Linux distribution. This does not need to be the same as your Windows login, but it might be easier to remember if you do set it to the same. The choice is yours.
File System Setup
Once the installation completes, and you create a username and password, you will see a slightly different command prompt. You are now running in the Ubuntu Linux environment. Any command you type will default to a Linux command. In addition, the Linux subsystem has its own filesystem that is separate and distinct from the Windows filesystem. The Windows filesystem can be accessed from WSL and vice versa.
Accessing Windows Files from Linux
On Windows, the file system is typically NTFS and is organized by drive (e.g. C: or D: etc.). The file system works slightly differently on Linux. Directories (i.e. folders) are 'mounted' inside each other. You can think of it like a book on a bookshelf. Each disk drive on your system is like a book and the file system as a whole is like the shelf. In Windows, each book is given a drive letter, while in Linux each book is given a directory name. For example, to access your Windows file system from Linux you have to go where it is mounted (e.g. what shelf it is on). In WSL, the mount point for each drive is in /mnt. NOTE: Linux uses the forward slash '/' to delimit a directory while Windows uses the backslash '\'. The cd command allows you to change directories while the ls command allows you to list the contents of a directory from the command line.

user@machine:~$ ls /mnt
c  wsl
user@machine:~$

This lists the contents of the /mnt directory. In this example, there are two things: c, which represents the C: 'drive' on Windows, and another directory, wsl, which WSL uses for device mapping. You can access any file on your Windows C: drive from Linux through /mnt/c/. For example, to list the contents of your Windows desktop you'd type:

user@machine:~$ ls /mnt/c/Users/WINDOWSUSERID/

Replacing WINDOWSUSERID with your actual Windows username.
Accessing Linux Files from Windows
On Windows, files are typically accessed through the File Explorer. WSL makes it easy to get to the Linux file system through the File Explorer. You can actually open the Windows File Explorer directly from your Linux command line.

user@machine:~$ explorer.exe .

NOTE: it is important to add the dot (.) after the command. The dot tells the Windows File Explorer to open the current directory (e.g. your Linux directory) instead of the default directory on Windows. When the File Explorer opens, you'll notice that your Linux file system is actually mapped to a network device in Windows, most likely: \\wsl$\Ubuntu. If you installed a different Linux distribution, the name might be different. On Windows, the folder for all of your user files (desktop, documents, etc.) is located in C:\Users\WINDOWSUSERID. On most Linux distributions, your Linux user files are located in /home/LINUXUSERID. Using the File Explorer, that is equivalent to \\wsl$\Ubuntu\home\LINUXUSERID. Find this folder in your Windows File Explorer and add it to your quick access bar. It will be very helpful to do this for accessing your Linux files later.
Installing Software
We'll need to use several Linux applications in this class. Installing software in Linux is pretty easy. Ubuntu provides a command called apt, which stands for Advanced Package Tool. It utilizes a central repository for packages that can be installed directly from the command line. To install the software we'll need for this course, run the following. You'll have to enter your password to grant apt administrator authority on your Linux installation.

sudo apt update
sudo apt install gcc make git

You'll be asked to confirm before apt starts downloading and installing the software. In order to install software in a Linux distribution you need to run as the system administrator. Windows has a similar restriction for most software. On Linux, the system administrator user is called root. The sudo command stands for "switch user and do an operation". There have been several comics and memes around root and sudo. In this case, that operation will be to run apt to install software. The first command updates the local database on your Ubuntu installation so that installed software will be at the most recent versions. The second command installs three (3) applications: gcc, make, and git. • gcc is the GNU C compiler – we'll be using that to build programs • make is a build tool that can be used to automate builds for projects. • git is a version control tool, specifically designed for distributed and group development. We will likely add a few more packages as we go.
Editing a Text File
Using the Windows File Explorer, create a text file in your Linux home directory. Add some text to the file, using your favorite text editor. From the Linux command window, change to your home directory by typing cd without any arguments. You'll know you're in your home directory when you see a tilde (~) symbol at your command prompt. Now, display the contents of the file with the cat command on Linux. For example, from the command prompt try this:

user@machine:~$ ls
text.txt
user@machine:~$ cat text.txt
this is some text

The cat command is short for "concatenate". It will print out the contents of a file to the command prompt. SUBMISSION REQUIREMENT: Capture a 'screenshot' of the command(s) to print out the file.
Exploring Linux
Now that you have WSL and Ubuntu Linux set up and installed, you will explore your Linux environment.
This exploration will involve experimenting with some other commands. There are some interesting and powerful things you can accomplish with relative ease. From your Linux command window, experiment with the commands below. You can get help on commands with the man command (which is short for "manual"). For example, try running man ls. You can search for appropriate commands with the apropos command. For example, try running apropos zip. Some commands you should experiment with (these are all typed via the command line, or terminal): 1. view a man page: man < command > e.g. man ls 2. list a directory: ls 3. list a directory, long: ls -l 4. list a directory, long, all: ls -al 5. search for a file in the file system: find -name .profile 6. search through a file or through program output: grep 7. change directory: cd < dir > 8. change directory up one level: cd .. 9. change to last directory: cd - 10. make directory: mkdir < dir > 11. remove file: rm < file > 12. remove directory: rmdir < dir > 13. copy files: cp < source > < dest > 14. display contents of a file: cat output.txt 15. zip some files into a .zip file: zip lab1 hello.cpp output.txt 16. create an empty file: touch empty.txt
Creating a Development Environment
Create a Development Directory
Next you will want to create a development directory to hold the projects you will complete. To do this, create a directory (folder) in your Linux home directory for this class. It is recommended to do this within WSL, but Windows Explorer could be used for this purpose. Avoid spaces in all directory and file names.
Setting up your Development Environment
WINDOWS: I highly recommend that you use VSCode https://code.visualstudio.com/. VSCode is a basic IDE created by Microsoft with lots of configuration and plugin options. VSCode is very WSL-aware and can be invoked from the WSL command line via the command code . Through various plugins, VSCode can be configured into a fully-featured IDE, but we will not necessarily be doing that in this course. However, you can really use any Windows-based editor that you like. A couple of other options: • CLion – This is a C/C++ editor created by JetBrains and is similar in look and feel to IntelliJ. – You can download CLion from the JetBrains web site: https://www.jetbrains.com/clion/ – There is no community version, but you can get a free license if you register with JetBrains as a student: https://www.jetbrains.com/community/education/#students • Notepad++ https://notepad-plus-plus.org/ is a simple text editor with text highlighting and has project file management as well. LINUX: While for this class it will be easiest to edit files from Windows and then build and execute them from your Linux command line, if you want to experiment with editors on Linux there are many out there. Unfortunately, running graphical Linux applications in WSL doesn't work (at least not without additional setup). The following examples are editors that can be used via the command line. They have no mouse support, but can be useful for editing text files in a hurry: • nano is a non-graphical text editor. It uses CTRL shortcuts for many operations. It should already be installed on your Ubuntu system. More information on nano can be found here: https://www.nano-editor.org/ • vim is a non-graphical text editor that you might have used if you use Git Bash on Windows. It is installed by default on your WSL Ubuntu installation. It should already be installed on your Ubuntu system.
Even if it does not become your primary coding editor, basic vim skills are useful as it is quite universal. More information on vim can be found here: https://www.vim.org/ • Emacs is another non-graphical text editor. It can be installed by running: sudo apt install emacs. More information on Emacs can be found here: https://www.gnu.org/software/emacs/. > What ChatGPT had to say about Emacs: "Ah, Emacs users—the digital equivalent of artisanal coffee brewers who insist on hand-grinding their beans. These are the folks who revel in the arcane, who find joy in a labyrinth of keybindings that would make a concert pianist sweat. They're not just writing code; they're crafting an experience, one Ctrl-Alt-Shift command at a time. Sure, they could opt for a sleeker, more modern text editor that doesn't require a Ph.D. in Emacs Lisp to customize. But where's the fun in that? Why simply get the job done when you can also get a workout for your fingers and a test for your memory? Emacs users are the type who believe that efficiency is overrated. After all, if you finish your tasks too quickly, what will you do with all that extra time? Probably just waste it learning another obscure Emacs command. Ah, the circle of life." NOTE: If you wish to enter the long-running debate of 'emacs' vs 'vim' (https://en.wikipedia.org/wiki/Editor_war), feel free to try out one (or both) of those editors.
Building and Running a Program
In this section you will be compiling, running, and modifying a program. Create a source file (hello.c) and add the following to that file.

/**********************************
 * hello.c
 *
 * A hello world program in c
 ***********************************/
#include <stdio.h>

int main(int argc, char* argv[])
{
    // Print a message to the user
    printf("Hello world!\n");

    // Return to exit the program
    return 0;
}

In order to execute this source code, it must be compiled. Like 'Java', compiling a 'C' program translates the source code into a format that is executable. One difference between 'Java' and 'C' code is that 'C' is translated directly into an executable and can be run without the need for a virtual machine interpreter. SUBMISSION REQUIREMENT: As you run the commands to compile and run your program, make sure you answer the questions associated with each step. At the command shell, issue the command: gcc hello.c Run the command to perform a long listing (listing with details) of the files that now exist in the directory with hello.c. • What files do you see? Include a 'screenshot' of the listing output. • How big is each file? How did you determine the size(s)? • Research the compiler gcc: what file contains the executable generated from compiling hello.c? Now run the generated compiler output. Include a 'screenshot' of the output in your submission. Invoking gcc as above will "build" the application from source. Building actually comprises several steps, including compilation and linking. Intermediate files are created but removed by default when gcc finishes, leaving only the executable. We can supply options to gcc by adding arguments to the command line. Some options you should try: -S -c -Wall -Wextra -std=c89 Try each of these options and, by observation and research, determine how each option changes the behavior of gcc. Describe briefly in your submission. By now you should know that the name of the executable gcc builds is a.out by default. Probably not terribly useful. How can you change this behavior?
To run the program you will have to type the following at the prompt:

$ ./a.out

Deliverables
As noted at the beginning of this document, collect the various requested screenshots and commentary in a Word document. Submit a pdf of that document to the Week 1 Lab Assignment in Canvas. It is intended that you be able to submit this by the end of the lab period.

$25.00 View

[SOLVED] Ai3603 homework 3

You are required to complete this homework individually. Please submit your assignment following the instructions summarized in Section 7.
1 Reinforcement Learning in Cliff-walking Environment
In this assignment, you will implement Reinforcement Learning agents to find a safe path to the goal in a grid-shaped maze. The agent will learn by trial and error from interactions with the environment and finally acquire a policy that scores as highly as possible in the game.
1.1 Game Description
Consider the 12×4 grid-shaped maze in Fig. 1. The bottom left corner is the starting point, and the bottom right corner is the exit. You can move upward, downward, leftward, and rightward in each step. You will stay in place if you try to move outside the maze. You are asked to reach the goal through the safe region and avoid falling into the cliff. Reaching the exit terminates the current episode, while falling into the cliff gives a reward of -100 and returns the agent to the starting point. Every step of the agent incurs a living cost (-1).
Figure 1: The cliff-walking environment
The state space and action space are briefly described as follows: State: s_t is an integer, which encodes the current coordinate (x, y) of the agent. Action: a_t ∈ {0, 1, 2, 3}, where the four integers represent the four moving directions respectively.
1.2 Implement Sarsa, Q-Learning, and Dyna-Q
You are asked to implement agents based on the Sarsa, Q-Learning, and Dyna-Q algorithms. Please implement the agents in agent.py and complete the training process in cliff_walk_sarsa.py, cliff_walk_qlearning.py, and cliff_walk_dyna_q.py respectively. An agent with a random policy is provided in the code. You can learn how to interact with the environment through the demo and then write your own code.
Hint: Take cliff_walk_sarsa.py as an example: • Line 27: more parameters need to be utilized to construct the agent, such as the learning rate, reward decay γ, ε value, and ε-decay schema. • Line 47: the agent needs to be provided with some experience for learning.
Hint: In agent.py: • You need to implement ε-greedy with ε value decay in the choose_action function. • Functions given in the template need to be completed. You can also add other utility functions as you wish in the agent classes.
1.3 Result Visualization and Analysis
Result Visualization: You are required to visualize the training process and the final result according to the following requirements: 1. Plot the episode reward during the training process. 2. Plot the ε value during the training process. 3. Visualize the final paths found by the intelligent agents after training.
Result Analysis: You are required to analyze the learning process based on the experiment results according to the following requirements: 2. Analyze the training efficiency of model-based RL (Dyna-Q) versus model-free algorithms (Sarsa or Q-learning). Please describe the difference in the report and analyze the reason in detail.
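For orientation before you start, here is a minimal, hedged sketch of ε-greedy action selection with ε decay plus a tabular Q-learning update. Every class, method, and parameter name below is hypothetical; the real interface is fixed by the agent.py template.

import numpy as np

class SketchQLearningAgent:
    # Illustrative only: tabular Q-learning with decaying epsilon-greedy
    def __init__(self, n_states, n_actions, lr=0.1, gamma=0.9,
                 epsilon=1.0, epsilon_min=0.01, epsilon_decay=0.99):
        self.q = np.zeros((n_states, n_actions))
        self.lr, self.gamma = lr, gamma
        self.epsilon = epsilon
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay

    def choose_action(self, state):
        # epsilon-greedy: explore with probability epsilon, otherwise exploit
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q[state]))

    def learn(self, s, a, r, s_next, done):
        # Q-learning target: reward plus discounted greedy value of next state
        target = r if done else r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.lr * (target - self.q[s, a])
        # decay epsilon toward a floor after each update
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)

A Sarsa variant would replace the greedy max over the next state with the value of the action actually taken next.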
2 Deep Reinforcement Learning
2.1 Game Description
Figure 2: The lunar lander environment in this assignment
This task is a classic rocket trajectory optimization problem. As shown in Fig. 2, you are required to control the spaceship and land between the flags smoothly. In this assignment, you are required to train a DQN agent in the "LunarLander-v2" gym environment. The definitions of state and action are given as follows: State (Array): the state s_t is an 8-dimensional vector: the coordinates of the lander in x & y, its linear velocities in x & y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not. Action (Integer): there are four discrete actions available: do nothing, fire left orientation engine, fire main engine, and fire right orientation engine. More details of this gym environment are given in the gym documentation. However, the information given in this file is sufficient for this assignment.
2.2 Read and Analyze the Deep Q-Network Implementation
In this section, a complete DQN implementation dqn.py is given for the lunar lander task. You are required to read the code and understand the DQN training process. You are required to write comments in the code to point out the function or your understanding of each part of the code. Please fill in every """comments: """ (cf. dqn.py) with your understanding of the code.
2.3 Train and Tune the Agent
In this section, you are required to train the DQN agent. Please show your training process (such as the learning curve of episode return) and the training result (such as the game video of the final model) in the report. • (Requested) Tune the hyper-parameters of the agent, especially the gamma value, epsilon value, and epsilon decay schema. • Tune the structure of the Q network. • Utilize multiple continuous frames of the game instead of one frame each time. • Other ideas.
3 Improve Exploration Schema
There exist lots of other exploration strategies besides ε-greedy. You are asked to find and learn one new exploration method, such as Upper Confidence Bound (UCB). Summarize the idea, pros, and cons of the new exploration method. Write your understanding in the report.
4 Installation
You can follow the tutorial in this section to install the environment on Linux or Windows; we strongly recommend you use a Linux system.
4.1 Install Anaconda
Open the address https://www.anaconda.com/distribution/ and download the installer of the Python 3.x version (3.8 recommended) for your system.
4.2 Install Required Environment
After installing anaconda, open a Linux terminal and create an environment for Gym:
conda create python=3.8 --name gym
Then activate the environment:
conda activate gym
Install gym and some dependencies:
pip install gym==0.25.2
pip install gym[box2d]
pip install stable-baselines3==1.2.0
pip install tensorboard
Install pytorch: Please follow the instructions given on the pytorch website.
5 Code, Demo Video, and Report
Code: You can edit the code between "##### START CODING HERE #####" and "##### END CODING HERE #####". Please DON'T modify other parts of the code.
Demo Video: Videos (optional) should be in .mp4 format, with a 10 MB max for a single file. You can compress/speed up the videos. We recommend recording videos utilizing gym wrappers: "env = gym.wrappers.RecordVideo(env, './video')". More information is given in the gym docs. All the videos should be put into a folder called videos.
Report: Summarize the process and results of the homework.
6 Discussion and Question
You are encouraged to discuss your ideas, ask and answer questions about this homework. If you encounter any difficulty with the assignment, try to post your problem on Canvas for help. The classmates and the course staff will try to reply.
7 Submission instructions
1. Zip all your program files, experiment results, and report file HW3_report.pdf into a file named HW3_ID_name.zip.
2. Upload the file to the homework 3 page on the Canvas.

$25.00 View

[SOLVED] Ai3603 homework 2

Homework 2
You are required to complete this homework individually. Please submit your assignment following the instructions summarized in Section 6.
1 Task Introduction
For this homework assignment, you are required to implement the minimax algorithm with Alpha-Beta pruning in a new checkers environment. Your agent should evaluate the current board state and select an action that maximizes its chances of winning against the opponent. Please refer to Section 2 for detailed rules and guidelines.
Figure 1: Checkers Board
2 Checkers Rules
2.1 Game Setup
• The game is played on a checkers board, as shown in Fig. 1.
• Player 1 controls 8 blue pieces and 2 special yellow pieces.
• Player 2 controls 8 red pieces and 2 special green pieces.
• Player 1 goes first, followed by alternating turns between the players.
• The objective is for each player to move all their pieces into the opponent's starting area: – Player 1 must move the yellow pieces to the green area and the blue pieces to the red area. – Player 2 must move the green pieces to the yellow area and the red pieces to the blue area.
2.2 Movement Rules
Each turn, a player can move one piece. Pieces can move in two ways, following standard Chinese Checkers rules:
• Move to an adjacent empty space: A piece can move to any of the directly adjacent positions if they are unoccupied.
• Jump over a piece: A piece can jump over an adjacent piece (belonging to either player) if the space directly on the opposite side, along the same line, is empty. Multiple jumps are allowed in a single move if conditions allow.
2.3 Special Rules for Extra Turns
• When a player successfully moves one of their special pieces (yellow for Player 1, green for Player 2) into the corresponding target area on the opponent's side for the first time, they are awarded an extra turn.
• Each player can earn up to two extra turns throughout the game (one for each of their special pieces reaching the target area).
2.4 Winning Condition
• The game ends when one player successfully moves all of their pieces into the opponent's starting area, with both special pieces occupying their corresponding target zones. The player who achieves this wins the game.
• The maximum number of rounds is 200 (excluding additional rounds). If the maximum number of rounds is reached, the player with more pieces in the correct zones wins the game.
3 Code Description
We have provided the code for the checkers environment and implemented two agents: the RandomAgent and SimpleGreedyAgent classes, which represent random and greedy strategies, respectively. These agents are implemented in the agent.py file. We have tested the code in Python==3.8. You are required to complete the YourAgent class in the agent.py file to execute the minimax algorithm with Alpha-Beta pruning. You will run runGame.py to execute a match between the two agents (as defined in the callback function). Complete the task according to the following requirements:
1. Implement the YourAgent class to apply the minimax algorithm with Alpha-Beta pruning.
2. Test your agent as the first player (Player 1) and as the second player (Player 2) against the SimpleGreedyAgent.
3. (Optional) Let two YourAgent instances compete against each other.
4. Document the results of these matches in your report.
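For concreteness, a generic sketch of minimax with Alpha-Beta pruning over an abstract game state is given below. The helpers legal_moves, apply_move, and evaluate are hypothetical stand-ins for whatever the provided checkers environment actually exposes, and the depth limit and evaluation function are design choices left to you.

def alphabeta(state, depth, alpha, beta, maximizing, legal_moves, apply_move, evaluate):
    # Generic minimax with alpha-beta pruning; returns (value, best_move)
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state), None
    best_move = None
    if maximizing:
        value = float("-inf")
        for move in moves:
            child_value, _ = alphabeta(apply_move(state, move), depth - 1,
                                       alpha, beta, False,
                                       legal_moves, apply_move, evaluate)
            if child_value > value:
                value, best_move = child_value, move
            alpha = max(alpha, value)
            if alpha >= beta:   # beta cutoff: the minimizer will avoid this line
                break
    else:
        value = float("inf")
        for move in moves:
            child_value, _ = alphabeta(apply_move(state, move), depth - 1,
                                       alpha, beta, True,
                                       legal_moves, apply_move, evaluate)
            if child_value < value:
                value, best_move = child_value, move
            beta = min(beta, value)
            if beta <= alpha:   # alpha cutoff: the maximizer will avoid this line
                break
    return value, best_move

A YourAgent move choice would then call alphabeta(board, depth, float("-inf"), float("inf"), True, ...) and play the returned move.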
4 Grading Criteria
The grading criteria for this project are as follows:
• Code: 70%, which must include basic comments, and only agent.py is allowed to be modified.
• Report: 20%, which should include the algorithm implementation process and necessary test results, limited to a maximum of eight pages.
• Performance: 10%, where performance evaluation will be determined through a match between YourAgent and an undisclosed Baseline.
5 Discussion and Question
You are encouraged to discuss your ideas, ask and answer questions about this homework. If you encounter any difficulty with the assignment, try to post your problem on Canvas for help. The classmates and the course staff will try to reply.
6 Submission Instructions
1. Complete the code and write the report in English.
2. Zip all your files into a file named HW2_ID_name.zip. The file structure should be as follows:
HW2_ID_name.zip
  code/
    agent.py
    other_files
  report.pdf
3. Upload the file to the homework 2 page on the Canvas.

$25.00 View