School of Artificial Intelligence and Advanced Computing

Policy: The report should be submitted in PDF format. The word count limit of the report is 3,000. All code must be written in Python. It should be well-structured, easy to read, and thoroughly documented. Each function should include comments explaining its purpose and functionality. Ensure that the code runs without errors. All required files should be included in a single zip file. A README file is needed to explain how to run the code and list any dependencies. The document must be submitted through Learning Mall Online to the appropriate drop box.

DTS406TC Natural Language Processing
Coursework 1 (Group Assessment)
Due: 5:00 pm China time (UTC+8, Beijing) on March 23, 2025
Weight: 40%
Maximum score: 100 marks (80% group marks + 20% individual marks)
Groupings: Each group consists of 2-3 students. You are free to select your own team members. Students who do not make a selection will be randomly assigned to a team. Once the teams are confirmed, no changes will be permitted.

Assessed learning outcomes:
A. Systematically comprehend the theoretical foundations of Natural Language Processing.
B. Apply statistical and machine learning techniques to process and analyze natural language data.

Overview
Sentiment Analysis is the process of determining whether textual content expresses a positive, neutral, or negative sentiment. With the vast amount of textual data (e.g., tweets, Reddit posts, reviews) generated daily, Sentiment Analysis can automatically identify users' attitudes from User-Generated Content (UGC), assisting companies or organizations in making informed decisions.

1. Literature Review on Sentiment Analysis (20 Marks, Individual Work)
a) Overview of sentiment analysis and its applications. Please provide three examples of real-life applications of sentiment analysis.
(6 Marks)
b) Please list three key challenges in sentiment analysis. (6 Marks)
c) Please elaborate on two traditional methods (e.g., Naive Bayes, SVM) and two deep learning approaches (e.g., BERT, GPT) for sentiment analysis, and discuss the advantages and disadvantages of each approach. (8 Marks)
d) Each team member should individually complete the literature review on sentiment analysis. This section will be scored individually.

2. Data Collection (12 Marks, Group Work)
Collect two datasets of User-Generated Content (UGC) from platforms such as Twitter, Reddit, or Weibo, focusing on sentiment analysis in different scenarios. Each dataset should contain a minimum of 3,000 instances. Preprocess the datasets by performing tasks such as stopword removal and tokenization. Finally, conduct a statistical analysis of the two datasets (e.g., the word distribution of the corpus). Note that some UGC data may be downloaded from Kaggle if API restrictions prevent direct downloads from social platforms. (6 Marks/dataset x 2 = 12 Marks)

3. Algorithm Description & Implementation (48 Marks, Group Work)
a) Choose four approaches for the sentiment analysis task on the collected UGC datasets: two using traditional methods and two employing deep learning methods. All four approaches should be applied to each of the UGC datasets. Please provide the pseudo-code and brief comments for each function of the pseudo-code. (5 Marks/algorithm x 4 = 20 Marks)
b) Develop a sentiment analysis system for each approach using Python. The implementation pipeline should include the following components: feature engineering (3 Marks, e.g., converting textual data to an embedding space appropriate to each approach), algorithm implementation (3 Marks, with fine-tuning required for the deep learning approaches), and metrics computation (1 Mark). (7 Marks/algorithm x 4 = 28 Marks)

4.
Results Analysis (13 Marks, Group Work)
a) Provide the sentiment analysis results for each approach applied to the two UGC datasets. Select and apply three relevant metrics (e.g., precision, recall, and F1 score) to assess the performance of the implemented models, with each metric worth 3 Marks. (9 Marks)
b) Explain the reasons behind the model performance for each approach. (1 Mark/algorithm x 4 = 4 Marks)

5. Report Writing (7 Marks, Group Work)
This coursework evaluates your understanding of the challenges of the problem and the correctness of the proposed algorithms. It also tests your professional skills in terminology usage, presentation of algorithms and experimental results, and the logical structure of the proposal. (7 Marks)

Submission
One of the team members must submit a single zip file named "TeamID_Coursework.zip". It includes: a cover letter with the group member information and the final PDF reports. The final PDF reports include the individual literature reviews on sentiment analysis and the group report. A folder labeled "algorithms" contains all the model implementations, data preprocessing scripts, and evaluation scripts. A folder labeled "data" contains all the datasets and the experimental results in CSV format.
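As a starting point for the traditional methods in Section 3, the pipeline of tokenization, stopword removal, and classification can be sketched with a small multinomial Naive Bayes written from scratch. This is a minimal illustration only: the stopword list and example usage are placeholder assumptions, and for the real coursework you would use a library such as NLTK or scikit-learn on your 3,000-instance datasets.

```python
import math
from collections import Counter

# Tiny placeholder stopword list; a real pipeline would use a full list (e.g. NLTK's).
STOPWORDS = frozenset({"the", "a", "is", "this"})

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOPWORDS]

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        # Log prior for each class from its relative frequency.
        self.priors = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        # Per-class token counts.
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set().union(*self.counts.values())
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, text):
        def log_posterior(c):
            lp = self.priors[c]
            for tok in tokenize(text):
                # Add-one smoothing so unseen tokens do not zero out the posterior.
                lp += math.log((self.counts[c][tok] + 1) /
                               (self.totals[c] + len(self.vocab)))
            return lp
        return max(self.classes, key=log_posterior)
```

The same fitted model's predictions can then be scored against held-out labels with the three metrics required in Section 4 (precision, recall, F1), e.g. via scikit-learn's `precision_recall_fscore_support`.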
STAT0045. In-course Assessment 2 (2024/25 Session)
Department of Statistical Science

General Instructions
· This assessment is classified as Coursework as defined in the UCL Student Regulations for Exams and Assessments. It contributes 40% to the overall mark for this module.
· The release date for this assessment is 12:00 (UK time) on Tuesday, 11 March 2025.
· The submission deadline is 16:00 (UK time) on Tuesday, 18 March 2025.
· Individual extensions to the submission deadline can only be granted where a student has been issued with a Summary of Reasonable Adjustments (SoRA), has used a Delayed Assessment Permit (if the assessment is eligible), or has made a valid claim for Extenuating Circumstances. The standard extension length for this assessment type is five working days.
◦ If you have a SoRA, your extension should be set up automatically and you should see it reflected in the deadline displayed in the submission portal. If you think that your SoRA adjustment has not been applied, please contact the module lead at the earliest opportunity.
◦ Delayed Assessment Permits and Extenuating Circumstances claims should be submitted through Portico. The module lead will be notified and will act on extensions approved via these routes, but the deadline displayed in the submission portal will not update instantly.
· In preparation for this assessment, please ensure that you are familiar with the Department of Statistical Science's guidance on academic integrity. When submitting your work, you will be required to make a declaration that you have read and understood this guidance.
· Parts of your submission may be scanned using similarity detection software. If any breach of the assessment regulations is suspected, it will be investigated in accordance with UCL's Student Academic Misconduct Procedure.
· To facilitate anonymous marking, you should not write your name anywhere on your work, including in file names or file descriptions requested as part of the submission process.
· You must only submit your work via the designated portal in Moodle. If you try to submit via email or any other channel, this will not count as a submission and will not be marked.
· There are strict, non-negotiable penalties for late submission, which for coursework are as follows.
◦ Up to 2 working days late: deduction of 10 percentage points, but no lower than the pass mark.
◦ 2-5 working days late: capped at the pass mark.
◦ More than 5 working days late: mark of 1.00%.
· If the module lead becomes aware of a significant technical issue or outage affecting Moodle during the assessment, a message will be circulated to explain what has happened and the steps being taken to mitigate the issue. If you do not receive notification of a more widespread issue and you experience technical difficulties, you should refer to the Help & Support resources provided by UCL's central IT service. However, last-minute technical issues will not be considered as valid grounds for missing the deadline, so ensure that you leave plenty of time to prepare, upload and check your submission.
· Non-submission (in the absence of any valid Extenuating Circumstances) will mean that your mark for this component is recorded as 0.00% and you will be deemed to have made an attempt.
· You should expect to receive feedback on this assessment within 20 working days of the submission deadline. In the event of a delay, the module lead will contact students directly with details of the revised timeline.
· © 2025 UCL. This assessment paper is the intellectual property of UCL and subject to copyright. It must not be reproduced or shared with any third party without prior permission of UCL.

The assessment
· This is an individual assessment; you must work alone.
· This assessment consists of two parts.
· For Part A, you can submit scanned/photographed handwritten solutions. Make sure that scanned work can be read clearly. Note the UCL advice on submitting scanned/photographed work (https://www.ucl.ac.uk/news/2020/apr/seven-simple-steps-submit-handwritten-answers-moodle-exams-or-assessments).
· For Part B you are required to write a report, and this report should be typed. Include a word count for this part.
· The relevant course material for this assessment is all the material up to and including Section 4.6. Exercise Sheet 7 is not included.
· Keep your answers concise. Answers that are unnecessarily elaborate or include information that is not asked for will be penalised.
· Part A and Part B are both marked on a scale of 0-100, and are equally weighted for the final mark. For Part A, marks for the constituent parts are listed in bold face. Marks are given for correct answers, but also for succinctness and clarity of explanation.
· To ensure anonymous marking, only provide your Student ID number at the top of Part A and B (and not your name). Part A and B should be submitted together in one PDF file. Submit the file with your Student ID as its name; for example, if your ID is 20001234, use the name 20001234.pdf.
· You can use R for the questions in Part A, but do not hand in R code. R code is information that is not asked for; see above.
· For Part B, you are allowed to use an AI tool (such as ChatGPT), but you should acknowledge the use of this and explain the way you used it.
· You can use the forum to raise queries during the assessment, but only if the queries concern clarification of tasks in the assessment. This option will only be available until 12 noon on March 17th, 2025. From that time onward the forum will be read-only until March 19th.

Part A

Question 1
For this question, you have to download a data set that is identified by your Student ID number.
· You can find the data in the Section ICA 2 on Moodle.
· Your data set is identified by your student ID number. Be careful to identify your data specifically. Marking is partly based on student-specific data analysis.
· If your ID is 20001234 for example, then select and download the text file 20001234.txt and put the file in the working directory of your R session.
· Read the file into your R session (e.g., dta <- read.table("20001234.txt", header = TRUE)) and inspect it with head(dta):

      y x
1  8.27 1
2  5.06 1
3 12.14 1
4  4.92 1
5  6.30 1
6  9.81 1

· If you cannot read in the data in R, contact the module lead as soon as possible via email: [email protected].
· The data are created in the format of a 100 × 2 table. The first column (y) is for response Y, and the second column (x) identifies the level of the treatment variable.

Your data concern a one-way ANOVA experiment regarding the height of a flower plant. Response Y is the height in centimeters, and x identifies the five treatment levels. Values x = 1, 2, 3, and 4 correspond with the use of four different fertilisers. The value x = 5 corresponds with the no-fertiliser treatment. The aim of the experiment is to establish how the height of the plant is affected by the choice of fertiliser. For the statistical inference, use a significance level of 5%.

(a) Consider the hypothetical case where data for this experiment are collected by someone who uses her garden. Say she collected the data by using the front and the back garden as follows: plants with fertiliser x = 1 and x = 2 in the front garden, and the plants with the other treatment levels in the back garden. Explain briefly and in simple terms (without using the word "randomisation") why this is not a good way to collect the data. [3]

(b) Potassium is a common ingredient in fertilisers. Consider the hypothetical case that fertiliser x = 1 has twice the amount of potassium compared to fertilisers identified by x = 2, 3, and 4. Would this undermine your statistical inference? Explain your answer.
[3]

(c) Define a one-way linear ANOVA model for response Y with an intercept. Define the model such that the intercept can be estimated by the mean of the observed values for Y under the no-fertiliser treatment. Write down the model equation and specify this equation completely for your data. [8]

(d) Fit the model in (c) to your data and report the ANOVA table with clearly defined rows and columns. Using the model definition in (c), define the hypothesis for testing whether all five treatment group means are equal. Test this hypothesis using the ANOVA table. Be explicit about the distribution you use for this test. [4]

(e) Provide the point estimates for all the model parameters in (c). [6]

(f) Define the estimator of the intercept in the model in (c) as a function of response mean(s) and derive the variance of this estimator as a function of the error variance σ² and the sample sizes for the treatment levels. Clearly explain your derivations. [8]

Question 2
(a) Show how to derive a in Section 2.6.3.4. Do not derive anything that is already shown in the lecture notes; solve for a using the equations that are available in the notes. Mind that a previous version of the lecture notes contained a typo in the expression for a. [6]

Consider stratified sampling with the following specifications:

Stratum   Stratum size   Stratum variance
1         2000           4
2         5000           4
3         3000           16
4         3000           16
5         2000           64

(b) Consider that the variance of the stratified sample mean ȲST is fixed at 1/10. Assume that costs are defined by C = c0 + Σℓ cℓ nℓ, where c0 = 100 and (c1, c2, c3, c4, c5) = (1, 2, 1, 2, 4). Derive the optimal strata sample sizes nℓ, for ℓ = 1, 2, 3, 4, 5. Explain your derivation. Hint: you may want to use R for the computation, but do not include R code in your answer. [14]

Question 3
Consider the following randomised response (RR) design for a yes-or-no question that asks respondents whether or not they have committed fraud.
Instead of answering the question directly, the respondent throws a dice and keeps the outcome of the throw hidden from the interviewer.
· If the outcome of the throw is 1 or 2, then the respondent answers yes.
· If the outcome of the throw is 6, then the respondent answers no.
· If the outcome of the throw is 3, 4, or 5, then the respondent answers yes or no in line with whether or not he or she committed fraud.

Assume that respondents follow the RR design. Let π1 denote the probability that a respondent has committed fraud.

(a) For this RR design, give the values of the conditional probabilities P(observed yes | latent yes) and P(observed no | latent no), where observed refers to the data collected, and latent refers to the unknown status regarding fraud. [7]

(b) The observed data are given by 300 yes-answers and 500 no-answers. Estimate π1 and calculate the standard error for this estimate. Explain your answers. [8]

Consider the RR design by Warner as specified on Slide 109. Define Yi = 1 when respondent i used illegal drugs, and Yi = 0 otherwise. Say there are two non-overlapping groups of respondents in the RR survey: Group A and Group B. Consider the logistic regression model logit P(Yi = 1) = β0 + β1 xi, where xi = 0 when respondent i belongs to Group A, and xi = 1 when i belongs to Group B. Using the RR design, assume that the probability of observing a yes-response in Group A is estimated by λ̂A.

(c) Define an estimate of β0 as a function of λ̂A. Explain your derivation. [10]

Question 4
(a) Consider Theorem 2.3 in Chapter 2 of the lecture notes. Using the notation in Section 2.8.2, provide the final details of the proof; that is, show that the estimator given there is indeed an unbiased estimator of Var(ȲCL). Provide the details of your derivation. Do not explain the existing equations in the proof of Theorem 2.3 in the lecture notes, but be clear which of the equations you use in your derivation. [9]

There are ten schools in a particular area.
As part of an investigation into teaching standards, an inspection team proposes to visit three of the schools and administer a test to all of the 14-year-old students in each school visited. The school sizes (in hundreds of pupils) are as follows:

School   1   2   3   4   5   6   7   8   9   10
Size    22  18  17  21  11  23  16  22  26  24

(b) Three pseudo-random numbers, distributed uniformly on (0, 1), have been obtained using R. They are 0.821, 0.228 and 0.307. Use these to select a PPS sample of three schools, explaining your procedure clearly. [8]

(c) Suppose that schools 4, 7 and 2 were selected (note that these are not necessarily the schools that would be chosen using the random numbers provided above), and that the average test results for these three schools were 14.5, 16.7 and 13.6 respectively. Use these data to estimate the average test result across all ten schools. Provide an estimated standard error for your estimate. [6]

Part B
For this part you are required to write a short report discussing aspects of data ethics for a given scenario.

The scenario: In the UK, housing benefit can help you pay your rent if you are unemployed or on a low income. If you receive benefit, then you need to report a change of circumstances for you and anyone else in your house. Examples of housing benefit fraud are not reporting all income or not reporting a change of income. In a large city in the UK, the manager who deals with housing benefit in the city wants to use data science to help identify fraud. The manager's idea is to use data from past receivers of housing benefit who were investigated for fraud. Assume that for these people individual information is available on whether or not fraud was detected. The manager envisages using the data to define a statistical prediction model and next to use this model to identify current receivers of housing benefit who are likely to commit fraud. You are asked to lead this project.
The main statistical parts of the project are: collecting relevant data, data analysis, defining a model that can be used for prediction, and using the model to make a prediction for current receivers of housing benefit. Assume that the chosen prediction model is a logistic regression model for a binary response variable with value 1 for fraud and value 0 otherwise.

Instructions and guidelines for the report:
· Write a report that discusses the scenario with a focus on data ethics. Limit the scope of data ethics to the material that is discussed in STAT0045.
· You should explicitly use the following terms in the report (and reflect on the concepts attached to these terms): "data subject", "model subject", "fairness", and "transparency".
· You should discuss to some extent the importance of GDPR in this project and give at least one concrete example of a measure that you would implement to warrant that GDPR guidelines are followed. In the discussion of GDPR you should explicitly use the term "personal data" in the report.
· Assume the reader knows the logistic regression model; do not discuss standard aspects, for example, how the model is defined or how to estimate model parameters.
· Give the report a title.
· Type the report in a text editor and add the word count at the end of the report. Use font size 12.
· Write the report in paragraphs and complete sentences. Using a few bullet points is OK, but do not write the report as a list of bullet points.
· The maximum word count for the report (including the title) is 700 words. Reports longer than 700 words will be penalised.
· If you use an AI tool (see instructions), then use an appendix to acknowledge this use. This appendix does not count towards the maximum word count.
· You can add literature references to the report. References do not count towards the maximum word count. There is no need to add references to the STAT0045 course material.

Hints:
· There is no specific need to use AI tools for this report.
· Mind the danger of using AI tools; see the slides on Use of AI Tools in Chapter 1.
· Although it is fine to refer to literature beyond the course material, there is no specific need to do so.
· The aim of this assignment is to see whether you are able to critically reflect on aspects of data ethics in a practical scenario. Do not just enumerate definitions or aspects of data ethics; focus instead on some of the aspects and explain why they are important in this scenario.
· You are not asked to solve potential problems in this scenario, or provide details of specific actions. The report should focus on potential issues with respect to data ethics; not knowing how the issues can be addressed in detail is OK.

Marking criteria: adherence to the above instructions and guidelines, and the quality of the presentation (readability, structure, language). [100]
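The optimal-allocation computation in Part A, Question 2(b) has the following general shape: with linear costs C = c0 + Σ cℓ nℓ and the variance of the stratified mean fixed, cost is minimised by taking nℓ proportional to Nℓ Sℓ / √cℓ and scaling to meet the variance constraint. Below is a minimal numerical sketch of that logic using the figures from the question; note that it ignores the finite-population correction, which the lecture notes may require you to include, so treat it as an illustration of the method rather than the exam answer.

```python
import math

# Stratum sizes N, variances S2, and unit costs c from the question.
N = [2000, 5000, 3000, 3000, 2000]
S2 = [4, 4, 16, 16, 64]
c = [1, 2, 1, 2, 4]
V0 = 1 / 10  # target variance of the stratified sample mean

N_tot = sum(N)
S = [math.sqrt(v) for v in S2]

# Optimal shape: n_l proportional to N_l * S_l / sqrt(c_l).
shape = [n * s / math.sqrt(cl) for n, s, cl in zip(N, S, c)]

# Scale factor t chosen so that Var = (1/N^2) * sum(N_l^2 S_l^2 / n_l) = V0
# (no finite-population correction).
t = sum(n * s * math.sqrt(cl) for n, s, cl in zip(N, S, c)) / (V0 * N_tot**2)
n_opt = [a * t for a in shape]

# Sanity checks: achieved variance and the implied total cost.
var = sum((n * s) ** 2 / nl for n, s, nl in zip(N, S, n_opt)) / N_tot**2
cost = 100 + sum(cl * nl for cl, nl in zip(c, n_opt))
print([round(x, 1) for x in n_opt], round(var, 4), round(cost, 1))
```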
INFT2051 – Mobile Application Development
Final Project

Teams: 1 person (individual), 2 people (pair), or 3 people (trio). However, if you work as a pair or trio, the expectation of the level of complexity and polish will be increased due to the number of people.

Instructions
Congratulations! The presentation you gave earlier of your draft concept for mobile technology impressed the directors of the technology company, and they are eager to hear more! They have invited you back, but this time they are expecting to see a working prototype. You will submit to them a mobile App using the techniques taught in this course (written in C# and MAUI using Visual Studio). In addition to your compiled and runnable application, you will also provide the directors with the full code behind your project and a document with further information.

You will provide a 1000-word written summary of your project. This requires a lot of care: you have a maximum of only 1000 words and you will quickly run out of space if you do not write concisely. This will be submitted as a PDF document. The PDF document mentioned above will include:
• Your student name(s) and number(s)
• Title of your project
• Explanation of the purpose of your project, what it does, how it works, and what real-world problem it solves, all in 1000 words. Any text over 1000 words will be ignored.
• Your design documents, including storyboards, data management, etc.
• The feature set included (sensors, hardware, software, etc.) and why these features were used.
• What you planned to do, how you would have done it, and why it was not completed.
• The approximate percentage contribution of each member, along with a statement to say that each team member has agreed to this percentage.
• An individual signed cover sheet, or a team cover sheet signed by all members of the team and then scanned.
• Any references for where code may have been sourced from.

Pay attention to your user interface. It should be intuitive and easy to use.
Test the operation of your project; if it isn't all working, just show the parts that work. In particular, if you think a program might crash, avoid showing the feature that makes it do so. You will receive marks for correct operation of the program, and for task complexity: the more complex your task is, the more marks you will receive.

Pay attention to the readability of your code. Do not use variable names such as x1 or j unless they are clearly informative in the context of your code. Use classes, methods, and functions where appropriate to separate your code into logical parts. Include informative comments.

Approaches that are not acceptable include:
• A programming project that does not use the methods and techniques discussed in this course.
• Any material or software that has been submitted for assessment for another course.
• Any material prepared by another person/team, unless you clearly indicate which is your own work.
• A presentation that fails to show the project working.

Teamwork
This may be an individual or team assessment, as advised by the course coordinator. Individual marks will be the same as overall marks unless there is an obvious mismatch in contribution (such as a member not clearly contributing to the development of the application), but evidence of this lack of contribution should be clearly provable (e.g., screenshots from your chat, lack of contributions in GitHub, etc.).

How to submit your assignment
The files must be submitted to Canvas by 23:59, Sunday of Week 12. Upload a single Zip/RAR folder that contains your complete clean solution and report, using the following naming convention for the folder (basically all group members' Student Numbers): 34567
DMS2030 Individual Assignment 1
Due Date: 11:59 pm, February 23rd, 2025

1. A fertilizer company from San Diego has provided the following data.

                    Last Year ($)   This Year ($)
Sales               23,000          34,000
Labor               10,000          16,000
Raw Material        8,000           10,000
Capital Equipment   2,000           4,000

(a) Compute the labor, raw material, and total productivity for each year. (5 points)
(b) Compute the overall labor, raw material, and total productivity for the two years. (5 points)

2. The inventory of a supermarket turns 46 times a year. The manager of the supermarket says that everything that the store receives from vendors is, on average, sold in two weeks. Are these statements consistent? Assume there are 52 weeks in a year. (5 points)

3. A truck rental company serves 30 rental requests per week on average. Each truck on average is rented for 3 weeks. The rental fee is $1,000 per truck per week. How many trucks are on rent on average? What is the average weekly revenue? (5 points)

4. A hospital emergency room (ER) is currently organized so that all patients register through an initial check-in process. At his or her turn, each patient is seen by a doctor and then exits the process, either with a prescription or with admission to the hospital. Currently, 55 people per hour arrive at the ER, 10% of whom are admitted to the hospital. On average, 7 people are waiting to be registered and 34 are registered and waiting to see a doctor. The registration process takes, on average, 2 minutes per patient. Among patients who receive prescriptions, the average time spent with a doctor is 5 minutes. Among those admitted to the hospital, the average time is 30 minutes. Assume the process to be stable; that is, the average inflow rate equals the average outflow rate. Please refer to the diagram on the next page.

(a) On average, how long does a patient spend in the ER?
(5 points)
(b) On average, how many patients are being examined by doctors? (5 points)
(c) On average, how many patients are there in the ER? (5 points)
Note: In the last stage (seeing a doctor), there are two types of patients, i.e., two types of flows. When you consider the average time in the last stage, you should take the weighted average, because these two types of patients have different measures.

5. Elaine owns a computer rental store. The rental process can be depicted by the following process flow diagram. Customers arrive at the store at an average rate of R = 20 per week, and each customer requests one computer. The rental time is 3 weeks on average. The rental fee is $300 per week. At the end of the rental period, the customer returns his/her rented computer to the store. Elaine will inspect every returned computer. The inspection time is negligible. On average, 10% of the returned computers need to be repaired by a technician (repair actually means re-configuration and minor repair). A computer that needs repair spends an average time of 1 week in the repair stage, including waiting and repair. After repair work is completed, the computer will be put into the stock for rental. Returned computers that don't need repair will be put immediately into the stock for rental. To ensure that there are always computers available to satisfy customer requests, Elaine has to keep an average of 10 computers in the stock for rental.

(a) What is the total number of computers that Elaine needs for her rental business? (5 points)
(b) With an average of 10 computers in the stock for rental, on average how much time does a computer spend in the stock for rental before it is rented by a customer? (5 points)
(c) Elaine estimates that the average rental time will become 2 weeks if she increases the rental fee to $500 per week. The customer arrival rate, the repair time and the percentage of returned computers that need repair will remain the same.
Again, she needs to have an average of 10 computers in the stock for rental. Should she increase the rental fee? Justify your answer by considering the number of computers required and the expected revenue per week. (5 points)

6. Bandai has a production line which produces Doraemon toys. There are three steps: in the molding step, workers transform plastic into white figures; in the painting step, workers transform the white figures into painted figures; in the packing step, workers transform the painted figures into final boxed toys. Currently, the number of workers and the time needed for each step are listed as follows:

Step       Number of Workers   Time Needed (per Unit per Worker)
Molding    5                   5 min
Painting   8                   10 min
Packing    4                   2 min

(a) Draw a flowchart to depict the toy production process. (5 points)
(b) What is the bottleneck of the process? (5 points)
(c) Suppose the demand for toys is 60 units per hour. What is the flow rate of the process? What is the capacity utilization for each step? What is the implied utilization for each step? (5 points)
(d) Bandai plans to hire 17 new workers for the Doraemon production line. How should Bandai train and allocate these workers to different steps in order to maximize the capability of the production line? What if there are 34 new workers? (5 points)

7. Vernon, Andy, and Sean are getting ready for their last period in the exchange study in CBS. Following the final exams, they intend to throw a big party with many of their friends from back home. Presently, they have identified the following set of activities that need to be completed. They decide not to spend any work on preparing the party until all final exams are over. Moreover, they aim to take a 3-day beach vacation as early as possible, but not before all party planning activities are completed. On June 10, they will enter the final exam week, which will take 5 days.
They then want to arrange for live music (which will take 5 days), evaluate a number of potential party sites (6 days), and prepare a guest list, which includes inviting their friends and receiving the RSVPs (7 days). They want to visit their two most promising party sites, which they expect to take 4 days. However, this can only be done once they have completed the list of party sites. Once they have finished the guest list and received the RSVPs, they want to book hotel rooms for their friends and create a customized T-shirt with their names on it as well as the name of the guest. Hotel room reservation (3 days) and T-shirt creation (6 days) are independent of each other, but both of them require the guest list to be completed. Once they have picked the party site, they want to have a meeting on site with an event planner, which they expect to take 4 days. And then, once all work is completed, they plan to take off to the beach.

(a) Draw a table to list all the activities, their immediate predecessors, and their durations. Draw the network based on the table. (5 points)
(b) What is the critical path? (5 points)
(c) For each activity, compute the slack time. (5 points)

8. The following table represents a project that should be scheduled using CPM:

                                        Time (Days)
Activity   Immediate Predecessors   Optimistic   Most Likely   Pessimistic
A          -                        1            3             5
B          -                        1            2             3
C          A                        1            2             3
D          A                        2            3             4
E          B                        3            4             11
F          C, D                     3            4             5
G          D, E                     1            4             6
H          F, G                     2            4             5

(a) Draw the network. (5 points)
(b) What is the critical path? (5 points)
(c) What is the probability of completing this project within 16 days? (5 points)

9. Any suggestions/comments for the class so far? (Optional)
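For question 8, the standard PERT treatment is: expected duration te = (o + 4m + p)/6 and variance ((p − o)/6)² per activity; the critical path is the longest path of expected durations, and the completion probability comes from a normal approximation using the sum of variances along that path. The following is a minimal sketch of that arithmetic using the table's data; the simple recursion here stands in for a full CPM forward/backward pass, so use it only to check your hand calculation.

```python
import math

# (predecessors, optimistic, most likely, pessimistic) from the table above.
acts = {
    "A": ([], 1, 3, 5), "B": ([], 1, 2, 3),
    "C": (["A"], 1, 2, 3), "D": (["A"], 2, 3, 4),
    "E": (["B"], 3, 4, 11), "F": (["C", "D"], 3, 4, 5),
    "G": (["D", "E"], 1, 4, 6), "H": (["F", "G"], 2, 4, 5),
}

# PERT expected duration and variance per activity.
te = {a: (o + 4 * m + p) / 6 for a, (_, o, m, p) in acts.items()}
var = {a: ((p - o) / 6) ** 2 for a, (_, o, m, p) in acts.items()}

def longest_path(a):
    """Longest expected-duration path ending at activity a."""
    preds = acts[a][0]
    if not preds:
        return te[a], [a]
    dur, path = max(longest_path(p) for p in preds)
    return dur + te[a], path + [a]

# H is the only terminal activity, so the project length is the longest path to H.
length, critical = longest_path("H")
sigma = math.sqrt(sum(var[a] for a in critical))

# P(project finishes within 16 days) under the normal approximation.
z = (16 - length) / sigma
prob = 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(critical, round(length, 2), round(prob, 3))
```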
Genetic Epidemiology Assessment Data files In the folder for the Assessment, you will find binary format PLINK files (i.e. a .bed file, a .bim file and a .fam file) for a genetic dataset used previously to explore genome-wide associations of autosomal variants with risk of coronary artery disease. This phenotype is mostly comprised of cases with myocardial infarction (more commonly known as ‘heart attack’) or who died of coronary artery disease. Unmatched controls to compare against the cases were recruited at each recruitment centre from the general population (e.g. via General Practice registers). The participants were recruited from across 8 European countries (Denmark, Germany, Greece, Italy, the Netherlands, Spain, Sweden and the United Kingdom). You will also find a separate file (‘assignment_covar.txt’). Within this file you will find the first 2 multi-dimensional scaling (MDS) vectors of genetic ancestry (‘mds1’, ‘mds2’), baseline age (‘age’) and participant sex (‘sex’, male==1, female==2). A variable called ‘pheno’ represents the phenotype of interest, in this case whether a participant has had a coronary heart disease event (pheno==2) or not (pheno==1). (NB: this same information is also encoded in the phenotype column in the .fam file). Finally, you have also been provided with a file (‘mds_variables.txt’) that contains the first 2 MDS vectors of genetic ancestry for all the study participants, as well as for samples from the 1000 Genomes project from a few of the different ethnicities: · CEU = Northern European ancestry samples from Utah in the United States, representing European ancestries; · CHB = Han Chinese samples from Beijing, representing East Asian ancestries; · JPT = Japanese samples from Tokyo, representing East Asian ancestries; · TSI = European samples from Toscana in Italy, representing European ancestries; · YRI = Nigerian samples, representing African ancestries. 
The file also includes information about which of the 8 European countries each of the study participants comes from. Instructions Using what you have learned during the Genetic Epidemiology module, use this dataset to conduct a genome-wide association study (GWAS) to see what these data can tell you about the association between autosomal SNPs and coronary artery disease. As discussed during the module, it is not informative about biology or disease to merely conclude that a genetic variant has strong evidence of association with a phenotype, so consider which of the several ‘post-GWAS’ approaches you have learned about in the module you could employ to build on your statistical findings and provide further biological insights. Write up your findings in a technical report of no more than 2000 words (excluding any Tables, Figures, legends, appendices or references). The report should include details of the methods you have used and the rationale for using those methods, the results you have generated (including any relevant tables or figures to display information), and a brief discussion that summarises your findings, puts your findings in context of previous literature and considers strengths and limitations of what you have done. You may wish to refer to the module marking rubric below. The choice of how to structure your report is up to you, but you may wish to consider subheadings such as “Pre-GWAS Quality Control, Association Testing, Post-GWAS Analysis, Discussion” or more traditional subheadings like “Methods, Results, Discussion”. Note that you are not expected to include an Introduction section as the aim of the exercise is to test your knowledge and understanding from the Module, not to spend time extensively reviewing the background of the phenotype or previous studies. When you refer to published literature, please list the references at the end of the report.
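As a concrete starting point, the quality-control and association steps can be assembled as PLINK v1.9 command lines. The sketch below builds those commands in Python; the input file names come from the assignment, but the filter thresholds are illustrative choices that you must justify yourself in the report:

```python
def qc_command(bfile="assignment", out="assignment_qc"):
    """Common pre-GWAS filters: per-SNP and per-sample missingness,
    minor allele frequency and Hardy-Weinberg equilibrium.
    Thresholds here are examples only."""
    return ["plink", "--bfile", bfile,
            "--geno", "0.05",      # drop SNPs with >5% missingness
            "--mind", "0.05",      # drop samples with >5% missingness
            "--maf", "0.01",       # drop rare variants
            "--hwe", "1e-6",       # drop SNPs failing HWE
            "--make-bed", "--out", out]

def assoc_command(bfile="assignment_qc", covar="assignment_covar.txt"):
    """Logistic regression of case/control status on each SNP,
    adjusting for genetic ancestry (MDS vectors), age and sex."""
    return ["plink", "--bfile", bfile,
            "--logistic", "--covar", covar,
            "--covar-name", "mds1,mds2,age,sex",
            "--out", "gwas_results"]
```

Each list can be passed to subprocess.run(). Remember from the module that it is the choice and justification of thresholds, not the commands themselves, that the markers are looking for.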
Even though you have variants from all 22 autosomal chromosomes, it should take no more than a few minutes to run PLINK v1.9 commands on this dataset on your computer as you have only been provided with directly genotyped variants (not imputed variants). However, bear in mind that the input files provided and the results files you produce will be considerably larger than those used in the GWAS practical sessions, so you may have to adapt accordingly. If PLINK is giving you strange errors (e.g. segmentation fault errors) or refusing to run, please (a) consult with a classmate in case they have already overcome this issue, then (b) email me with details of the error. This assignment is not supposed to be testing your understanding of specific computing issues, so don’t spend hours unable to make progress due to computing problems! Some additional guidance for markers ● The rubric is for guidance on expectations – it is not a prescriptive template. Use your judgement and experience, and interpret flexibly. ● Remember that the students have only spent 4 days on the Genetic Epidemiology module, which was completely new to many of them, so judge accordingly. For example, students have learned about the principles of data filtering for quality control purposes, but have not spent much time considering how to determine what exact filtering thresholds to use. ● We intentionally have not provided guidance on how the different sections should be weighted. Think about the quality of each section, then give an overall mark based on these and your overall impression. While this assessment includes considerable elements of computational work, it is primarily intended to test their understanding of the principles and practices of Genetic Epidemiology, so the most important aspect is that they have used appropriate methods and justified why they are using them, interpreted their results appropriately etc.
● The words used in the rubric will be interpreted differently by different people. Talking with colleagues could help you gain clarity. Appreciation of inherent uncertainty is one of the course learning outcomes. Remember, not every aspect of the report needs to be at Distinction level to merit an overall Distinction grade.
● Remind yourself of the instructions given to students above.
● Provide constructive feedback. Give your overall impression, and provide specific, actionable feedback, ideally feedback that students can apply in subsequent work. Including specific examples from the assignment to illustrate points is helpful.
● Student submissions are blinded, so please do not try to determine the identity of the student whose assignment you are marking, and please do bear the principles of blind marking in mind: make every effort to mark based only on the merits of the submission itself, using the criteria in the rubric provided, rather than anything you might know about the student, their supervisor, or any previous feedback you may have given.

Marks and feedback form
Student name:
Marker 1 feedback comments:
Marker 2 feedback comments:
Quality control
GWAS analysis
Post-GWAS aspects
Tables & Figures
Discussion
Overall impression
Up to three key things done well
Up to three key aspects that could have been better
Mark (provisional, then replaced with agreed final mark)
Rationale for final mark
Any other useful feedback for student

Genetic Epidemiology Module Assessment marking rubric

Quality control
Refer: Little attempt to ensure data cleanliness or mitigate potential biases. Use of methods poorly justified.
Pass: Reasonable efforts made to ensure data cleanliness and mitigate potential biases. Use of methods mostly well justified.
High Pass: Good efforts made to ensure data cleanliness and mitigate potential biases. Use of methods well justified.
Distinction: Considerable efforts made to ensure data cleanliness and mitigate potential biases. Clear justification for all methods.

GWAS analysis
Refer: Analysis poorly implemented, inadequately described and not appropriately interpreted.
Pass: Analytical steps well implemented but only partially described and interpreted.
High Pass: Analytical steps clearly described, appropriately implemented and interpreted.
Distinction: Thorough analyses, clearly described and well justified, with thoughtful and appropriate interpretation.

Post-GWAS aspects
Refer: Little effort to follow up beyond the GWAS analysis.
Pass: Reasonable efforts to follow up findings from the GWAS analysis, but only partially described, justified or interpreted.
High Pass: Good efforts to follow up findings from the GWAS analysis, generally well described, justified and interpreted.
Distinction: Extensive efforts to follow up findings from the GWAS analysis, well described, clearly justified and appropriately interpreted.

Tables and Figures
Refer: Limited use of any Tables & Figures.
Pass: Appropriate use of Tables & Figures but not always clearly labelled or easy to interpret.
High Pass: Tables & Figures are mostly appropriately used and are generally well labelled and clear to interpret.
Distinction: Very clear, well-labelled, appropriate Tables & Figures that are easy to interpret.

Discussion
Refer: Little attempt to interpret findings and to put results into context of previous studies. References absent or largely incomplete.
Pass: Partial interpretation with some attempt to put findings into context of previous studies. References mostly relevant and using standard format.
High Pass: Good interpretation of findings with clear background context. References all relevant, using standard format.
Distinction: Excellent interpretation with findings clearly put into context of previous studies. References all relevant sources, consistently using standard format.

Overall impression
Refer: Poor analysis and report, lack of clarity in writing, doesn’t show clear understanding of the topic.
Pass: Good analysis and report, mostly clearly written and demonstrates reasonable understanding.
High Pass: Very good analysis and report, clearly written and demonstrates very good understanding.
Distinction: Excellent analysis and report, very clear and thorough, demonstrates excellent understanding.
IMM250 Science & Society Term Paper Topic and Guidelines (2024-2025) Controlling the Cholera Crisis: A Global Challenge The World Health Organisation (WHO) has identified the current upsurge in Cholera outbreaks from 2021 to date as an area of concern for public health. Despite the successes in controlling cholera over the last few decades, the disease has re-emerged in endemic and non-endemic regions where cases have not been seen for years. Cholera outbreaks can run rampant in conditions of poor sanitation, where the only available option for drinking water is likely unsafe for consumption. These conditions are common in times of conflict and poverty and can be compounded by the effects of urbanization, climate change and extreme climate events from droughts to floods, and in the aftermath of major storms. Cholera is caused by Vibrio cholerae, a gram-negative bacterium that is found in stagnant water bodies or reservoirs where the source is contaminated with either animal or human fecal waste. Infection occurs when this water or cross-contaminated food is consumed and can result in mild to severe disease or even be asymptomatic. For patients with symptoms, there are a range of antibiotics that can be used for treatment but resistance to many if not all these antibiotics is another cause for concern worldwide. According to the WHO, the global risk for cholera is very high and the need for action is critical; they have set an ambitious target date of 2030 for a 90% reduction in deaths and cholera elimination in high-risk countries. Is this achievable? Prompts: In your Science and Society Paper, consider how V. cholerae causes disease and how it spreads. What are the clinical manifestations of cholera, and how is it diagnosed? How does our immune system combat this bacterial infection? How do we test for and treat cholera? Are effective vaccines available or are they still in the developmental pipeline? 
Make sure to go into sufficient detail and use a variety of resources (both academic and non-academic) to support your work. You might also want to think about access to treatment in areas of poverty and conflict, where only limited access to well-resourced testing and treatment facilities may be available. What factors could pose additional challenges to disease control, and how can some of these obstacles be overcome? How have the effects of extreme weather events and climate change aided the current upsurge in cholera outbreaks across the globe? What does this mean in terms of the burden to public health? Use specific examples and give an in-depth overview based on your chosen narrative and style (see below). Use of generative AI tools: The work you submit for this assignment must be your own and may not include any content from generative artificial intelligence (AI) tools, either verbatim or with edits. You may, however, use generative AI to support your work to create an outline for the assignment, but the final submitted assignment must be original work that is correctly referenced and produced by you alone. Any use of generative AI must be documented in an appendix for your assignment. The documentation should include what tool(s) was/were used, how they were used, and how the ideas generated by the AI were incorporated into the submitted work. Please note that any uses of generative AI beyond the above is NOT permitted and will be considered unauthorized aid, which is an academic offence. Submissions will be assessed at the discretion of the course coordinator, and students will be asked to show evidence of their work if a case of Academic Integrity and the inappropriate use of generative AI tools is suspected. If you are unsure about whether using generative AI tools is right for your work, please discuss this with Dr. Amith first. Format: 1,800 words +/- 10%; double-spaced; 1-inch margins; 12 pt Times. 
Your paper can refer to any assigned course content (required/recommended readings (e.g., papers, textbook), references provided on slides), or additional references that you find on your own. Please do NOT reference lecture slides. Assignment Content: Your assignment can be approached in several ways, including (but not limited to) the above prompts where you can focus on: 1. Pathogenesis: How does V. cholerae cause cholera? Focusing on this aspect should include a detailed discussion of anti-bacterial immune responses in humans, specific to cholera, on a cellular and molecular level in asymptomatic to severe disease, as the disease progresses. 2. Transmission: How do poor sanitation and lack of access to safe drinking water contribute to the spread of disease? What environmental or educational measures need to be put into place to control the spread of cholera and prevent outbreaks? Is transmission in endemic versus non-endemic countries different? 3. Epidemiological factors: What epidemiological factors determine occurrence rates, distribution, and spread of cholera in endemic compared to non-endemic regions? Do historical trends match what we are currently seeing? Discuss the factors that might be contributing to changes in the epidemiology of cholera specifically compared to waterborne bacterial diarrhoeal diseases in general. 4. Treatment and prevention: What are the effects of antibiotic resistance in the treatment of cholera worldwide? Are there risk factors that might make certain people more susceptible to infection? You can also consider how immune responses might differ in healthy versus vulnerable or immunocompromised populations, and what that might mean in terms of treatment. Are vaccines available, and how inclusive is access to vaccines to prevent potential future outbreaks? 5. A ‘holistic approach’: where you can discuss any combination of 2 to 3 of the above points and explore key points in some depth, as per word limits.
Your paper will be assessed using the criteria detailed in the rubric on the next page. For the Scientific Content (worth 50%) of your Science & Society term paper, consider the above prompts and use them as a starting point for your own work. You do not have to discuss every single prompt listed – just discuss those points that fit best into the story you are telling. Remember that your target readership is a non-scientific audience, so you will need to explain any complex scientific terms and concepts in a way that anyone (regardless of their scientific background or expertise) can understand. You can choose any Tone and Style (worth 20%) for your work; it can be a more traditional academic style, or you can set an informal non-academic tone, as long as it fits with your overall narrative. For example, your work can be in the form of a descriptive essay, an investigative report (like a news article), a case study, a blog, a podcast transcript, or an interview with a scientist, clinician, or patient. You also have the option of including figures or infographics in your work – you may want to use this opportunity to create something original based on your readings. Make the most of this creative freedom and give us an original take on Controlling the Cholera Crisis. (Originality and Creativity, worth 10%) Use of Sources and Accuracy of Information (worth 20%) Make sure that the resources you use are well-balanced between academic and non-academic works. Sources must be cited in-text, and included in a list of references at the end in Vancouver reference format. Note that your work will be submitted to the plagiarism detection tool Turnitin to check for similarity to other published works; a similarity score of 25% or less (including your references) is considered acceptable.
Principles of Health Data Analytics Assignment Deadline for submission: April 30th 2025, 16:00 Provisional mark and feedback release date: May 31st 2025 Length: 2,500 words (excluding bibliographies, tables, appendices, pictures and graphs and associated captions). Aim: In this assignment you are provided with a ‘Case Study’ comprising a short description of a problem faced by a walk-in clinic and a set of data about the clinic. You are asked to compare two proposed changes to the clinic staffing and assess the likely impact of each on specific operational outcomes. You are provided with a software tool that you can use, along with the synthetic data, both downloadable from the Moodle page, and the information provided in this briefing note, to model the clinic and the impact of changes to aspects of the clinic. Output from the model should allow the proposed changes to be assessed. Your submission should be in the form of a report presenting your results to key decision-makers who are not data analysts, including appropriate recommendations that follow from the data analysis and an explanation of any assumptions and limitations of the work. The assignment aims to assess your ability to: structure a problem, identify a model and assess its validity and reasonableness, and to analyse and present data and model output in a way tailored to inform decisions made by managers who are not specialists in data analysis. Case Study: The Borchester Urgent Care Centre A local commissioning group is reviewing the operation of a “24 hr walk-in” clinic for minor injuries / GP services. Managers have concerns about the way the clinic functions given the large fluctuations in demand: big queues can build up and sometimes people leave before being seen, likely attending the A&E of the local hospital, while at other times doctors are left sitting idle. The commissioning group’s stated aim is to reduce queues but without too much clinician idle time.
The two options identified are:
1) to add another member of staff to the team assessing and treating patients;
2) to add a member of staff who would triage patients, so that some patients can be discharged more quickly.
You are asked to assess the impact of these options on waiting times and staff utilisation in the clinic. Your model will need to include the following variables, which can be obtained from the data and information provided:
i) number of available staff
ii) mean rate at which new patients arrive over the period concerned
iii) mean duration of consultations
You should assess the impact of the two proposed interventions on:
(a) mean wait time
(b) mean queue length
(c) proportion of time each member of staff is engaged in assessing or treating patients (including completion of notes).
You will also need to consider how to model triage, e.g. how long does the triage process take? What proportion of patients is discharged or transferred at triage? You may choose to restrict your attention to the performance of the clinic at peak times.
Briefing
The Borchester Urgent Care Centre (UCC) is based in a former hospital building and is used to deliver a variety of community services. Borchester UCC provides treatment for any illnesses or injuries which aren’t life-threatening but still need treating quickly, such as: minor head injuries, suspected broken bones and fractures, sprains, cuts and scrapes, bites, eye problems and rising temperatures. Staff at Borchester UCC are able to request X-rays and some other imaging investigations.
Staffing levels
The clinical staffing at Borchester UCC consists of a GP and an emergency nurse practitioner at all times. The Emergency Nurse Practitioner works autonomously and can see a wide range of conditions. Consultant cover is provided by the Borchester A&E department but, because of other commitments, the consultant is rarely physically based in the UCC.
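The three variables listed above are exactly the inputs of a standard multi-server queue. One possible model, under the assumption of exponentially distributed service times (an assumption you would need to check against the data), is M/M/c with the Erlang-C formula; a minimal sketch:

```python
from math import factorial

def mmc_metrics(lam, mu, c):
    """Steady-state M/M/c metrics: returns (P_wait, mean wait Wq,
    mean queue length Lq, per-server utilisation rho).
    lam = arrival rate, mu = service rate per server,
    c = number of servers; requires lam < c * mu."""
    a = lam / mu                      # offered load in Erlangs
    rho = a / c                       # per-server utilisation
    if rho >= 1:
        raise ValueError("unstable: arrivals exceed service capacity")
    # Erlang-C probability that an arriving patient must queue
    num = a**c / factorial(c) / (1 - rho)
    den = sum(a**k / factorial(k) for k in range(c)) + num
    p_wait = num / den
    wq = p_wait / (c * mu - lam)      # mean time spent queueing
    lq = lam * wq                     # mean queue length (Little's law)
    return p_wait, wq, lq, rho
```

For example, mmc_metrics(lam=10, mu=6, c=2) gives the wait-time and utilisation figures for two clinicians each completing six consultations per hour against ten arrivals per hour; the triage option can be approximated by reducing lam for the downstream queue.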
Activity levels
Activity levels at the unit are, overall, lower than was anticipated when the unit was set up. Nevertheless, there have been complaints that at certain times patients are faced with long queues and some choose to leave before being seen, often attending the local A&E or making use of other out-of-hours services.
Triage
For options that incorporate use of a triage nurse, patients with cuts, grazes and bruises would be dealt with by the triage nurse and sent home, while patients with circulatory or obstetric problems would be directed immediately to the local emergency department. Triage that does not involve any treatment is estimated to take five minutes on average.
Data
You are provided with a dataset containing information about the patients attending the clinic, including arrival time, time they are seen and discharged, problem, service time and any supplementary notes for the year 2024.
Report
Your submission should be in the form of a formal report of no more than 2,500 words in length and should include the following elements (the word counts are guidelines):
• Executive Summary ~250 words
This should be a short summary of the key elements of the report.
• Introduction ~500 words
The introduction should summarise the context of the work and justify the approach taken.
• Method ~750 words
The methods section should include a detailed account of how the data were used. A detailed account of the assumptions made and of any values that had to be estimated should be part of this section.
• Results ~500 words
Results should be summarised clearly in words and, where appropriate, in a table or figure. Think about what form of presentation is most appropriate. Be careful not to include so much data or model output that your key messages are obscured.
• Discussion ~300 words
The discussion section should include a clear statement of the limitations of the work and a discussion of the results leading to the conclusion proposed in the recommendations section.
• Recommendations ~200 words
The recommended option must be clearly stated and based on the analysis. Any risk associated with the recommended option, and possible mitigation strategies, should be presented as well.
Use of AI for the assessment
The assessment for PHDA falls in category 2 of the use of AI tools in assessment (the description of the different categories can be found here). You can use AI in an assistive role (e.g. proofreading, testing code), but the work and text that you submit must be your own. If you make use of AI tools, this should be declared in your report just as you would for any other tool you use, e.g. Excel, Tableau, ...
Assessment Criteria Section Criteria Total marks Executive Summary • Short and clear summary of the key findings of the report 5% Introduction • Evidence of knowledge of the clinical area and problem to be addressed • Motivation for the work is clearly articulated. • Justification for the approach taken is provided. 10% Methods • Evidence of a precise account of the methods used, and understanding of why such methods are appropriate to answer the clinical question. • Demonstrates an ability to evaluate the limitations/assumptions of any methods used. • Sufficient detail is given to allow others to replicate the work. 25% Results • Ability to summarise results precisely • Demonstrates an ability to present and describe model outputs in a manner that would allow interpretation by a range of stakeholders.
• Evidence of knowledge surrounding the principles of data visualisation where appropriate 20% Discussion • Ability to construct a coherent and clearly structured narrative to arrive at a decision based on data • Evidence of understanding the limitations of the work, including how such limitations might affect study results 20% Recommendations • Demonstrates an ability to translate analytical findings into recommendations for clinical practice • Identification of risks and how they are mitigated 10% Presentation of the report • Clarity of presentation (including appropriate use of appropriate, well-selected references) • Ability to present information using report format 10%
Java Lab 24: Robot Delivery Simulation This lab has been designed to give you practice using threads and synchronization. Problem statement: This application will simulate a system of autonomous robots delivering packages. The central system, simulating a warehouse, will create Parcel objects and place them on a queue for delivery, where each parcel will have a pseudo-random delivery time to simulate the time it would take to deliver this package – that is, the distance from the warehouse to the parcel's destination. Each robot will take a parcel from the queue and "deliver" it by waiting the delivery time before returning to the queue for another parcel. Each robot will have a battery life (set at 100 time units): it can only operate for that long until it needs to be recharged (which this app won't simulate – once the robot's battery dies, that's it – but to keep things simple, the robot will magically deliver the last package). When all the robots' batteries die, the simulation is over and the statistics of this run of the simulation are printed. Figure 1 shows one run of the program: the user chose 2 robots. Each column shows the parcel # that a robot delivered and the delivery time for that parcel. Here, Robot 0 delivers parcels 0, 3, 5, 6, 7, 9, 12, and 13 before its battery dies. Note that it exceeded its battery life (100) by 11 units – that’s okay, because again, we’re assuming the last package (#13, requiring 20 time units) gets delivered. Robot 1 delivers parcels 1, 2, 4, 8, 10, 11, and 14 before its battery died. Again, Robot 1 exceeded its battery life by 17 units, delivering package #14 by magic. The interleaving should make sense: while Robot 0 was out delivering parcel 0 (which took 22 time units), Robot 1 delivered parcel 1 (which took 14) and picked up parcel 2 (12); while parcel 2 was in transit, Robot 0 picked up parcel 3, and so on. The warehouse thread placed those 15 packages, plus 10 more that did not get delivered, on the queue. 
Figure 2 shows a run using 3 robots. After you code the missing parts, try running your code with 2 robots several times; of course, when you run your program, you’ll get different answers because of the random parcel timing. Then try it with 3, 4, and 5 robots. Design: The main program will prompt the user for the number of robots to create, call setup() to create the data structures, the robots and their threads, and start the robot threads. Main will then create the warehouse simulation thread. That thread will new-up the data structures required for the simulation. To give you practice with four kinds of synchronization tools, you'll use the following. A thread-safe queue (BlockingQueue) will be used for parcel storage; the warehouse will create Parcel objects (which contain a random time-to-deliver value) and place them on the queue. Use offer() and take() to put and remove parcel objects. Robots will remove parcels from the queue and wait for the time-to-deliver amount to simulate delivery. This queue will be a shared structure among the warehouse and robots, so all will keep a reference to the same object. The synchronized Stats class (all of its methods are tagged with synchronized) will also be shared; it will keep track of the number of parcels delivered by each robot, the total delivery time for the parcels it delivered, and the actual time used by the robot. You will need to make its methods synchronized to make this data thread-safe. A static AtomicInteger counter will be shared as a static global item. It will be initialized to the number of robots. As each robot's battery dies, the counter will be decremented. The warehouse will stop producing parcels when the counter reaches zero. To make sure that the robots share the screen without problems, synchronize the code block that prints their delivery message on the stats object.
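The interplay of these tools can be hard to visualize before coding. The sketch below shows the same producer/consumer shape in Python (used here only for illustration; your lab must be written in Java, and names like run_simulation are invented for this sketch). queue.Queue plays the role of BlockingQueue, and the lock plays the role of synchronizing on the stats object; note this sketch stops robots with end-of-queue markers rather than the lab's AtomicInteger countdown, which is a simplification:

```python
import queue
import threading

def run_simulation(delivery_times, n_robots=2, battery=100):
    """Toy analogue of the lab's design: a warehouse thread produces
    parcels on a thread-safe queue; robot threads consume them until
    their battery runs out. Times are abstract units, not real sleeps."""
    q = queue.Queue()
    delivered = {r: [] for r in range(n_robots)}
    lock = threading.Lock()            # stands in for "synchronized (stats)"

    def warehouse():
        for pid, t in enumerate(delivery_times):
            q.put((pid, t))            # like BlockingQueue.offer()
        for _ in range(n_robots):      # end markers so idle robots can stop
            q.put(None)

    def robot(rid):
        charge = battery
        while charge > 0:
            item = q.get()             # like BlockingQueue.take()
            if item is None:
                break
            pid, t = item
            charge -= t                # "deliver": spend battery, even past 0
            with lock:
                delivered[rid].append(pid)

    robots = [threading.Thread(target=robot, args=(r,)) for r in range(n_robots)]
    w = threading.Thread(target=warehouse)
    for th in robots:
        th.start()
    w.start()
    for th in [w, *robots]:
        th.join()
    return delivered
```

Running it with fixed delivery times shows the interleaving the lab describes: parcels are split between the robots in whatever order the scheduler lets each thread reach the queue.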
Design Class Lab24Main: main(): prompt for the number of robots; print a header for the robot output; call setup(), and create and start the warehouse thread. ALREADY CODED. setup(): initialize the queue, stats, and counter objects; create the robot and thread arrays; create the robots and threads; start the robot threads. MOSTLY CODED – NEEDS FINAL LOOP. run(): this is the warehouse thread. Until the counter reaches 0, it will create a new Parcel object every five time units (by calling Thread.sleep(5) before creating the next parcel) and place the parcel on the queue. When the loop is done (all robots have finished), join all the threads, then call printStats(). YOU MUST CODE THE RUN LOOP. printStats(): displays a chart of the robot and warehouse statistics. See Figures 1 and 2. ALREADY CODED. Class Parcel: ALREADY CODED. Parcel( ): initializes the parcel's id and deliveryTime getDeliveryTime(), getId(): getters. Class Robot: Robot(BlockingQueue, Stats): sets the references to the shared data and sets the id. ALREADY CODED. run( ): initialize beginTime; then loop: while its battery is still charged (greater than zero), do the following: - get a parcel from the queue, - Thread.sleep( ) for the parcel's delivery time, - decrement the battery, print the parcel delivery information (see Figures 1 and 2), - print the delivery message, synchronized on stats – this is ALREADY CODED. - add to the stats for this robot. When the battery loop is done – we'll assume that the last package actually was delivered, even if technically the battery ran out during delivery (say it has a bit of reserve power) – set endTime; set the stats' running time for this robot, then decrement the AtomicInteger counter by one. Class Stats: ALREADY CODED. Stats(int numberOfRobots): new up the data structures. synchronized putParcel(int robotNumber): increment the parcel counter for this robot.
synchronized putTime(int robotNumber, int time): increment the time counter for this robot by time (sums all parcel times for this robot) synchronized putRobotTime(int robotNumber, long time): set the robot's running time synchronized getParcel(int robotNumber), getTimes(int robotNumber), getRobotTimes(int robotNumber): getters for this robot's parcel count, total parcel delivery time, robot running time. Deliverable: Zip up *all* the .java files as _lab24.zip.
EIE2105 Digital and Computer Systems Tutorial 7: Microprocessor Design Basics and Instruction Set Architecture Q1. Figure Q1a shows part of the memory space of an 8088 system. It is given that the current content of registers DS, SI, AX and BX are, respectively, 10H, 10H, 60H and EF0H. Figure Q1a Figure Q1b Trace the assembly language program segment shown in Figure Q1b. Show the contents of registers AX and CX after the execution of each instruction. Q2. Explain how the bus width of a microprocessor’s address bus and data bus affects the system performance of a computer system. Q3. A program is compiled to generate the follow executable machine code sequence. 78H 56H 32H 87H 23H 98H 42H 11H 05H … The program is then loaded to the main memory of an 8086-based computer system starting from address 5000H as shown in Fig.Q3 for being executed. Fig Q3 Based on the available information, answer the following questions. (a) What is the logical address of the datum stored in memory location 5002H? (b) What is the physical address of the datum stored in memory location 5003H? (c) What is the logical address of the 5th byte of the program (i.e., 23H)? (d) What is the physical address of the 5th byte of the program? (e) What is the physical address of the 5th byte of the program before the program is being loaded into the main memory? (f) Data A is a word in the program and it is stored in memory locations 5006H and 5007H, determine its value. (g) Compare the life cycle of data A’s logical address and physical address. Which one is longer? (h) If the microprocessor of the system wants to access data A, which address signal (in value) will be delivered through the address bus? Is it a physical address? Q4. How does the X86 system know the fetched binary pattern is an instruction or not? Q5. Fill in the blank with the following keywords: [mainframe. 
computer] [secondary storage] [primary storage] [conditional branch] [register] [server] [grid] [minicomputer] [Control unit] [server] [Application] [instruction] [registers] [memory] [I/O device] [System] [computer network] [bus] [cluster] [ALU] [programming language] [general-purpose] [RAM] [operating system]
1. A(n) ________ generally supports more simultaneous users than a(n) ________. Both are designed to support more than one user.
2. A(n) __________ is a storage location implemented in the CPU.
3. The term _______ refers to storage devices, not located in the CPU, that hold instructions and data of currently running programs.
4. A problem-solving procedure that requires executing one or more comparison and branch instructions is called a(n) _________.
5. A(n) ________ is a command to the CPU to perform one processing function on one or more data inputs.
6. The term ________ describes the collection of storage devices that hold large quantities of data for long periods.
7. A(n) __________ is a computer that manages shared resources and allows other computers to access them through a network.
8. The major components of a CPU are the _______, _______, and _______.
9. Primary storage can also be called _______ and is generally implemented with __________.
10. A set of instructions that is executed to solve a specific problem is called a(n) _________.
11. A(n) __________ is a group of similar or identical computers, connected by a high-speed network, that cooperate to provide services or run an application.
12. A(n) ___________ is a group of dissimilar computer systems, connected by a high-speed network, that cooperate to provide services or run a shared application.
13. A(n) ___________ consists of computing resources with a Web-based front-end interface to a large collection of computing and data resources.
14. A(n) ___________ is a hardware device that enables a computer to communicate with users or other computers.
15.
A CPU is a(n) _________ processor capable of performing many different tasks simply by changing the program.
16. The ________ is the “plumbing” that connects all computer system components.
17. Most programs are written in a(n) ________, such as C or Java, which is then translated into equivalent CPU instructions.
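The segment:offset arithmetic behind Q3 can be checked programmatically: in 8086 real mode, the physical address is the segment shifted left four bits plus the offset. A minimal Python sketch (the segment value 0500H is an assumption for illustration; any segment:offset pair satisfying segment × 16 + offset = 5000H maps to the program's load address):

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real-mode translation: shift segment left 4 bits, add offset.
    The result is masked to 20 bits, matching the 8086's address bus."""
    return ((segment << 4) + offset) & 0xFFFFF

# Consistent with Q3: program loaded at physical address 5000H.
# Assuming CS = 0500H (one of many segment values mapping there):
assert physical_address(0x0500, 0x0000) == 0x5000
# The 5th byte of the program (23H) sits 4 bytes past the start:
assert physical_address(0x0500, 0x0004) == 0x5004
print(hex(physical_address(0x0500, 0x0004)))  # prints 0x5004
```

The same function also illustrates why a logical address's life cycle differs from a physical one (part g): many segment:offset pairs, fixed at compile/link time, can map to one 20-bit physical address chosen at load time.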
EMET8002 Case Studies in Applied Economic Analysis and Econometrics Semester 1 2025 Computer Lab in Week 5 This week we continue to use the 2015 Programme for International Student Assessment (PISA) dataset for our analysis. PISA assesses skills of 15-year-old students across many countries and involves data on student performance (reading, mathematics and science knowledge), as well as parent, school and teacher questionnaires. The data is publicly available and can be downloaded from the PISA website under the section “SPSS (TM) Data Files (compressed)” (https://www.oecd.org/pisa/data/2015database/). However, these are large files and take some time to download and convert to Stata format. Therefore, we have prepared a smaller merged dataset of the Student and School questionnaire data files which now only includes the relevant variables. You can download this data (“Week4_PISA_data.dta”) from Wattle. Question 1: Multinomial Logit Regression We will run multinomial logit models with the following categorical variable as the dependent variable: “org_type” (“What kind of organisation runs your school?”). (a) Exclude missing values for the following variables: repeated, school_size, age, male, international_lan, mother_edu, father_edu, quiet_study, good_listener, add_math, class_size, computer. (b) Tabulate the variable “org_type”. How do you interpret the output? Create a label for the variable and values of “org_type” to make interpretation easier and tabulate the same variable again. [Hint 1 : -1 is coded for missing data, 1 means “A church or other religious organisation”, 2 means “Another not-for-profit organisation”, and 3 means “A for-profit organisation”; Hint 2: type in “help label” in Stata for more suggestions]. (c) Run a multinomial logit regression with “org_type” as the dependent variable and the following independent variables: “male”, “age”, “international_lan”, “mother_edu”, “father_edu”. 
Since the independent variables may contain missing values, include dummy variables that are equal to 1 if there is a missing value and 0 otherwise. [Hint: see “help mlogit” for suggestions]
(d) Why does Stata omit the following two dummy variables: “male_m” and “age_m”? Run the multinomial logit regression from part c without these two dummy variables.
(e) Calculate the relative risk ratios. How do you interpret the output? [Hint: see “help mlogit” for suggestions]
(f) Calculate predicted probabilities for a student to be at a church school, at a not-for-profit organisation and at a for-profit organisation if they are (i) male and 15 years old, (ii) female and 15 years old, (iii) male and 16 years old, and (iv) female and 16 years old. How do you interpret the output? [Hint: see “help margins” for suggestions]
(g) Compute the marginal effects for the “male” variable for each of the three possible outcomes of the dependent variable (church school, not-for-profit organisation, or for-profit organisation). How do you interpret the estimated coefficients? [Hint: see “help margins” for suggestions]
(h) How are predicted probabilities and marginal effects related?
Question 2: Ordered Logit Regression
For this question we use the variable “good_listener” as our dependent variable. This is a categorical variable which ranges between 1 and 4, with 1 being a poor listener and higher values indicating higher listening abilities.
(a) Explain how this dependent variable is different from “org_type” in Question 1 and why we would use an ordered logit regression in this case.
(b) Tabulate the variable “good_listener” and interpret the table. What does the value -1 mean?
(c) Run an ordered logit regression with “good_listener” as the dependent variable and the following independent variables: “male”, “class_size”, “computer”, “international_lan” and “age”. How do you deal with missing values in the dependent and independent variables? Interpret the output.
[Hint: see “help ologit” for suggestions]
(d) Compute odds ratios for the same regression as in part c. How do you interpret the output? [Hint: see “help ologit” for suggestions]
(e) Calculate predicted probabilities for each of the four possible outcomes of “good_listener” for males versus females. Interpret the output.
(f) Do a Brant test to check whether the proportional-odds assumption required for an ordered logit model holds. Would you still use an ordered logit model? What are the alternatives? [Hint: see “help brant” for suggestions. Most likely, you will need to install the package spost13_ado.pkg, which you can find by typing “findit spost13_ado” in Stata]
Question 3: Preparation for Research Project [not required for problem set]
We strongly recommend that you have chosen a paper by now, as the research proposal is due at the end of week 6, and the data collection process can also take some time. Students are also expected to have started the data application process by now. If you are unsure of the process, ask for help.
(a) Will you need any additional data for your extension? This is optional, as some extensions can be done using the same data as for your replication.
(b) In small groups, discuss what extensions you are planning for your research projects and why they are meaningful, interesting and worth investigating. Give feedback to each other about ways to improve the motivation of your extensions.
(c) Discuss the ethical issues and errors associated with the use of AI-assisted technologies. Note that for this course, the use of AI tools constitutes a breach of academic integrity. (The only exception is when students use ChatGPT or other AI tools for proof-reading, in which case this needs to be disclosed in a statement at the end of your assignment, and you may be asked to provide drafts from before and after the proof-read.) If you are unsure of what is appropriate, ask for help.
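For parts (e)-(g) of Question 1, it helps to remember what the mlogit output means numerically: a relative risk ratio is exp(coefficient), and predicted probabilities come from a softmax over the outcome-specific linear indices, with the base outcome's index fixed at 0. A pure-Python sketch with made-up coefficients (these are NOT estimates from the PISA data; the base outcome is taken to be the church school):

```python
import math

def mnl_probs(betas, x):
    """Multinomial logit predicted probabilities via softmax.
    betas holds one coefficient vector per non-base outcome; the base
    outcome has an implicit linear index of 0."""
    scores = [0.0] + [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up coefficients on (constant, male, age) for the two non-base
# outcomes of org_type:
betas = [[-0.5, 0.2, 0.01],   # not-for-profit vs church (base)
         [-1.0, 0.3, 0.02]]   # for-profit vs church
x = [1.0, 1.0, 15.0]          # constant, male = 1, age = 15
probs = mnl_probs(betas, x)   # [P(church), P(not-for-profit), P(for-profit)]
assert abs(sum(probs) - 1.0) < 1e-9

# Part (e): the relative risk ratio reported by "mlogit, rrr" is exp(b).
rrr_male_nfp = math.exp(0.2)  # RRR of "male" for the not-for-profit outcome
```

Marginal effects (part g) are then changes in these predicted probabilities as one covariate moves, which is why predicted probabilities and marginal effects are closely related (part h).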
FIT5147 Data Exploration and Visualisation Semester 1, 2025 Programming Exercise 1: Tableau (5%)
Please carefully review all the requirements below to ensure you have a good understanding of what is required for your assessment.
1. Instructions & Brief
2. Assessment Resources
3. Assessment Criteria
4. How to Submit
5. Word Count & Penalties
1. Instructions & Brief
In this assignment you are required to read in some data, explore and visualise it using Tableau Public/Desktop, and then submit a brief report showing your findings and the visualisations you used. It is an individual assignment and worth 5% of your total mark for FIT5147. Relevant learning outcomes for FIT5147: 1. Perform exploratory data analysis using a range of visualisation tools; 6. Implement interactive data visualisations using R and other tools.
THE DATA: The data set used in this assignment is based on the AusStage online resource. It is “a data set of live Events with dramatic and performance content covering all of Australia and New Zealand plus many additional International links” (AusStage, n.d.) and is regularly updated. While the data we are using was collected in February 2025, it includes both recent events as well as ones from previous centuries. We introduced this dataset in the Week 1 Workshop. For this assignment, use the provided PE1 dataset to produce your Tableau visualisations and visual analysis. It is based on that found in AusStage but has been slightly modified. To enhance your understanding of the context and metadata, you can check the data source link. Using the various interactive tools for the data source of the full dataset may help enrich your visual analysis: https://www.ausstage.edu.au/pages/learn/search-ausstage . If you discuss or replicate the visualisations or metadata provided by AusStage, be sure to reference these correctly in your report. Please note that the event and performance names relate to real Australian performance art and culture.
Some names may have some explicit terms. For this PE1 assignment, the resulting data describes when and where events occurred in the state of Victoria. In this activity we will explore the use of a few attributes:
Column: Description
EventName: The title or name of an Event.
EventIdentifier: A unique number identifying an Event in AusStage.
FirstDate Year: The year of the Event’s first public presentation, including previews.
FirstDate: The year (and day or month if known) of the Event’s first public presentation, including previews.
LastDate: The year (and day or month if known) of the Event’s final public presentation.
VenueName: The name of the Venue where an event happens.
VenueIdentifier: A unique number identifying the Venue where an event happens.
Suburb: The suburb or local district where the Event happens.
State: The Australian state or territory where the Event happens.
Country: The country where the Event happens.
Primary Genre: The kind of Event, as defined by its main mode of performance.
Organisations: The name of the organisation/s associated with an Event.
Contributor Count: Number of contributing people recorded in AusStage for this Event.
Resources Count: Number of related resources recorded in AusStage for this Event.
Longitude: Geographical location (longitude) of the Venue.
Latitude: Geographical location (latitude) of the Venue.
Table 1: Fields of the “AusStage_S12025PE1” data set
In this data there are some irregularities or errors that were part of the original data. One of the requirements of this assignment is for you to find (using data visualisation), describe and handle them. This modified dataset can be found on Moodle in the Assessments section under the Programming Exercise 1 heading.
References: AusStage. (n.d.). AusStage: About. AusStage. Retrieved February 27, 2025, from http://www.ausstage.edu.au/pages/learn/about
VISUAL ANALYSIS QUESTIONS TO BE ADDRESSED
Using the data and visual analytics, you will need to answer the following questions:
1A. What are the most common event names?
To answer this question, discuss how you are going to identify, measure and visualise what the most common events are.
1B. How many events started each year over the last 25 years? To answer this question, discuss how you are going to identify, measure and visualise the number of events started each year.
1C. How many events were run or performed by each organisation? To answer this question, treat each Organisations value as a single organisation, even if it includes different groups. Discuss how you are going to measure and visualise the number of events for each organisation.
1D. How many organisations started events each year over the last 25 years? To answer this question, discuss how you are going to identify, measure and visualise the number of organisations that started events each year.
1E. How long did each event run for? To answer this question, discuss how you are going to identify, measure and visualise the number of years each event ran for.
2. Based on the visualisations and findings for 1A-E, is it possible for you to now explain who (i.e., which Organisation) ran or performed what events over the last 25 years? For this question, you need to discuss whether your visual analytics for 1A-E have enabled you to answer this question or not. Be sure to explain how you came to that conclusion.
ASSESSMENT TASK
The task has two components: data exploration using Tableau, and a short written report.
Data Exploration using Tableau: The steps you are expected to complete:
1. Load the dataset in Tableau Public/Desktop
2. Use data visualisation in Tableau to check for and find at least two aforementioned irregularities in the dataset. Each type of irregularity may occur multiple times in the data. These irregularities are not related to missing data.
3. Amend the data to correct these errors using any tool of your choice (e.g., Excel, Python, R, Tableau) and justify your choice of correction based on the irregularity.
4.
Use Tableau to create at least one visualisation per question (not more than 2 per question) to conduct your visual analysis and answer the above questions. Remember to select appropriate visual variables to suit the data and your chosen visualisation.
5. Polish up your visualisations for presentation, e.g., add a suitable title, correctly label your axes, make sure labels and values are not truncated, include a legend. Ensure the font, font size and colour are suitable and legible for your report.
6. Write a report that presents and describes your data exploration process and visual analysis. See below for details.
This exploration must be submitted as a Tableau workbook file (*.twb suffix).
Note: Indications of missing data like UNKNOWN/unknown, NULL/null, N/A, tba values should not be regarded as irregularities for this assignment. If Tableau has any issues automatically recognising any date or time information, then this is not to be regarded as an irregularity for Step 2 but can be corrected by you in Tableau.
Written Report
Once you have finished your data exploration, write a report that contains the following information:
1. Data loading, checking and cleaning (i.e., Steps 1 to 3)
o A brief explanation (maximum of one paragraph per error) and an accompanying image of each of the errors or irregularities that you have found, showing how you found them using Tableau, and explaining and justifying how you resolved them. The image must show a relevant visualisation, not just the data or a table.
2. Data Exploration and Presentation (i.e., Steps 4 and 5)
o Explanation of what insights you have found through the visual analysis in order to answer the questions. This should include:
■ Your answer to the question, based on your visualisation(s). Include relevant visualisations in 1 or 2 figures per question.
■ Description of your visualisation(s) and how they relate to the data and question (i.e., why it is an appropriate visualisation choice).
■ Justification of your visualisation(s) and choice of visual variables.
■ Any further insights or issues that you have identified from the data or visualisation(s) while answering the question.
The report should also:
● Be submitted as a PDF file
● Be no more than 5 pages in length, including figures, with a minimum font size of 10 (title page and any table of contents are excluded from the page limit)
● Be properly structured with headings, subheadings, figure captions (in-text referencing of captions), page numbers, and references (where appropriate)
● Have high-quality images of your visualisations with clearly readable and legible text/labels (presume that it is read as part of an A4 document with no zooming).
● You must use proper academic referencing for all reports in this unit. This should follow either the APA or IEEE structure as recommended by the Faculty. Use the library referencing guide for support.
● Not include any code snippets except for key Calculated Fields in Tableau.
● No Generative AI software or system may be used to complete this assessment task. This includes using any software that paraphrases, translates or rewrites your text.
2. Assessment Resources
● AusStage_S12025PE1.csv (Available on Moodle)
3. Assessment Criteria
The following outlines the criteria which you will be assessed against. The focus of the marker will be on what you have included in your report, but your submitted Tableau Workbook may be examined if there are any concerns with the academic integrity of your work.
● Demonstrated ability to check and clean data and read it into Tableau [1%]
● Demonstrated ability to appropriately visualise data for data exploration using Tableau [2%]
● Demonstrated ability to see trends/patterns in data [1%]
● Quality of report [1%]
4. How to Submit
Once you have completed your work, take the following steps to submit your work.
1. Save your report as a .pdf file.
2. Name your file using the following structure PE1_Surname_StudentID
3.
Save your Tableau workbook as a .twb file.
4. Compress the .twb workbook file into a .zip file so it can be submitted to Moodle. DO NOT include your report in your zip file, only your Tableau workbook.
5. Name your zip file using the following structure PE1_Surname_StudentID
6. Click the Add Submission button on Moodle to submit and upload your report and workbook.
Please note that your assignment MUST show a status of "Submitted for grading" before it can be marked. Any submission left in draft mode will not be marked. We recommend always double-checking that your submission has been completed and that you have uploaded the correct files. Penalties will apply to any submission which needs amendment after the deadline.
5. Word Count & Late Penalty
The report must not be more than 5 pages of graded material including figures (min. font size 10). Up to 2 additional pages may be used if you wish, but restricted to:
● 1 page prior to the report as a title page with a table of contents.
● 1 page after the report only for references.
1 mark (out of the total of 5) will be deducted if the report does not meet these requirements. As per Monash policy: All late submissions will receive a penalty of 5% per day late (0.25 marks per day out of a total of 5 marks), including weekends. Work submitted more than seven days after the due date will not be marked.
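Step 3 of the data exploration allows any tool for correcting irregularities. As one hedged illustration of a programmatic check, an event whose LastDate precedes its FirstDate Year (one plausible kind of date irregularity, not necessarily one present in the real file) could be located in Python. The rows below are invented stand-ins; only the column names follow Table 1:

```python
import csv
import io

# Hypothetical rows standing in for AusStage_S12025PE1.csv; the real file
# is on Moodle and has many more columns (see Table 1).
sample = io.StringIO(
    "EventName,FirstDate Year,LastDate\n"
    "Hamlet,2003,2004\n"
    "Macbeth,2010,2008\n"   # LastDate before the first year: suspicious
)

def suspect_rows(fh):
    """Flag events whose LastDate year precedes their FirstDate Year."""
    out = []
    for row in csv.DictReader(fh):
        first = int(row["FirstDate Year"])
        last = int(row["LastDate"][:4])  # keep only the year part
        if last < first:
            out.append(row["EventName"])
    return out

print(suspect_rows(sample))  # prints ['Macbeth']
```

Whatever tool you use, the assignment still requires you to find the irregularities visually in Tableau first and to justify each correction in the report.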
School Name: School of Intelligent Finance and Business
Module Name & Code: Supply Chain Risk Management/IFB213TC
Academic Year/Semester: AY24-25 / Sem 2
Type of Assessment: Coursework
Weight of Assessment: 100% of the final grade for this module
First Attempt/Resit: First Attempt
Submission Date: 5pm, 7 April 2025
Word Count: 1500 (+/-10%)
COURSEWORK
Read the case study on Golden Star Ltd and then complete the required tasks.
Golden Star Ltd is a private petrochemical company headquartered in the city of Qingdao, China. It is one of the so-called “teapot” oil refiners and operates two oil refineries along the coastline of Shandong province. Most of the petrochemical products produced by Golden Star Ltd are sold to domestic customers. Because the domestic economy of China is slowing down, the People’s Bank of China (the central bank) has started to ease its monetary policy to stimulate economic growth. Monetary easing is likely to cause devaluation of the Chinese Yuan against the US dollar and Euro, which are used by Golden Star Ltd to buy crude oil from foreign suppliers. All of the crude oil used by Golden Star Ltd is imported. Around 30% of the crude oil imports come from Iran and Russia; the rest are from Angola, Saudi Arabia, Malaysia, Indonesia and the United Arab Emirates. Currently, transactions with Iran and Russia have not been affected by the sanctions imposed by the US and EU because they are not settled in Euro or US dollar. However, there is a risk that the US and EU may tighten the sanctions on Iran and Russia in the future. For example, if both decide to strictly enforce secondary sanctions on the entities buying crude oil from Iran and Russia, Golden Star Ltd will have no choice but to stop importing crude oil from these two countries, because its transactions with the other supplying nations are settled in US dollar or Euro.
Also, the political instability in the Middle East and the ongoing Russo-Ukraine conflict have made the global oil price more volatile in the past five years. For example, a drone attack on a major oil installation in Saudi Arabia in 2019 caused a benchmark oil price, the Brent Crude Futures price, to increase by almost 20% in two days.
In the past two years, Golden Star Ltd has been working with a local university to improve the refining process used in its refineries. The CEO of Golden Star Ltd plans to introduce an improved refining process in the next three months. If this process is successfully implemented, the yield of high-value-added products, such as gasoline, will increase by 10%. However, past experience suggests that the implementation of a new process is not always smooth. For example, at the early stage of the implementation, Golden Star may encounter temporary disruptions to normal production in its refineries.
Golden Star Ltd has not had any major accidents in the past five years. However, recent inspections of its refineries and warehouses by local authorities identified several deficiencies in its compliance with the local health and safety regulations. Some of the deficiencies, if not quickly addressed, may lead to catastrophic accidents, such as fires and explosions. Local authorities fined Golden Star CNY 2 million for the compliance issues and required it to take immediate actions to improve its compliance.
Required Tasks
Task 1
Use the Failure Modes and Effects Analysis (FMEA) model to analyze the potential failures in the supply chain of Golden Star Ltd:
a) Identify the failure types and the potential cause(s) and effect(s) of the failures. [20 marks]
b) Estimate and then rank the Risk Priority Numbers (RPNs) of the potential failures identified in a). Note that you should justify the ratings assigned to the three components of RPNs (i.e., Occurrence, Severity and Detectability).
[20 marks]
c) Recommend some control measures that could reduce the ratings of the three components of RPNs calculated in b) and then re-estimate the RPNs, assuming that your recommended measures will be successfully implemented. [20 marks]
Task 2
Apart from the potential risks revealed by the case study, identify any two potential ESG risks in the supply chain of a petrochemical company like Golden Star Ltd and then briefly discuss how to mitigate these risks. [20 marks]
INDICATIVE GRADING SCHEME
1. Presentation of the Coursework (including writing quality; effective and accurate use of statistics, tables, diagrams, or/and illustrations; and word count/page numbers) [10 marks]
2. Content of the Coursework:
§ Task 1 (three sections; each with 20 marks) [60 marks]
§ Task 2 [20 marks]
3. Referencing, Citations and Independent Research (including correct application of the Harvard citation style and evidence of independent research) [10 marks]
[100 marks in total]
Performance Descriptors:
Grade Point Scale Criteria
A 80+
Ø Excellent execution of the assessment brief
Ø Evidence of extensive independent research; many relevant, suitable references have been consulted to write the coursework
Ø Mastery of relevant subjects/topics at a level beyond what has been explicitly taught
Ø Excellent application of the Harvard citation style
Ø Excellent ability to synthesize information, construct arguments and perform critical analysis
Ø Excellent presentation & writing skills; the coursework focuses on the tasks and is well-structured
Ø Potentially worthy of publication
B 70 - 79
Ø Excellent execution of the assessment brief
Ø Evidence of significant independent research; an adequate number of relevant, suitable references have been consulted to write the coursework
Ø Mastery of relevant subjects/topics
Ø Very good presentation; only one or two minor deficiencies
Ø Commendable ability to synthesize, construct arguments and perform
critical analysis
Ø Very good writing skills; the coursework focuses on the tasks and is well-structured
Ø Very good application of the Harvard Referencing Style; only one or two citation mistakes
C 60 - 69
Ø Very good execution of the assessment brief; only a few omissions and minor mistakes
Ø Evidence of independent research; an adequate number of references have been consulted to write the coursework, although a few of them are not directly relevant to the coursework or not from ideal sources
Ø Good command of relevant subjects/topics
Ø Good presentation, although some areas could be improved
Ø Able to synthesize, construct arguments and perform critical analysis
Ø Good command of English, although there are a few grammatical, typographical or/and punctuational errors; the coursework focuses on the tasks and is well-structured
Ø Good application of the Harvard Referencing Style; only a few citation mistakes
D 50 - 59
Ø Good execution of the assessment brief, but there are some omissions and mistakes
Ø Evidence of independent research; however, the number of references or/and the quality of some references falls short
Ø Adequate understanding of relevant subjects/topics, although there are a few learning gaps
Ø Satisfactory presentation, although there are some deficiencies
Ø Able to synthesize, construct arguments and perform
critical analysis, but some parts of the coursework are descriptive
Ø Adequate command of English, although there are some grammatical, typographical or/and punctuational errors; some deficiencies in the structure or/and the writing sometimes deviates from the tasks
Ø Satisfactory application of the Harvard Referencing Style, but there are some citation mistakes
E 40 - 49
Ø Marginally satisfactory execution of the brief; there are significant omissions and/or mistakes
Ø Inadequate independent research; significant shortfalls in the number and/or quality of references
Ø Marginally satisfactory understanding of relevant subjects/topics; there are some noticeable learning gaps
Ø Marginally satisfactory presentation, but there are noticeable deficiencies
Ø Little effort to synthesise, critically evaluate and analyse; some sections are very descriptive
Ø Attempted to apply the Harvard Referencing Style, but there are many citation mistakes
Ø There are many grammatical, typographical or/and punctuational errors; the writing lacks focus and/or structure
F 0 - 39
Ø Poor execution of the brief; there are many significant mistakes and/or omissions
Ø Little evidence of independent research and little effort to incorporate relevant references into the writing
Ø Unsatisfactory understanding of relevant subjects/topics
Ø Unsatisfactory presentation; there are many significant deficiencies
Ø Very poor command of English; there are many grammatical, typographical or/and punctuational errors
Ø The coursework is ill-structured and most sections have little connection to the tasks
Ø No attempt to apply the Harvard citation style
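For Task 1(b), the Risk Priority Number is the product of the three ratings: RPN = Occurrence × Severity × Detectability, each typically rated on a 1-10 scale. The sketch below shows only the arithmetic and ranking; the failure modes echo the case study, but the ratings are illustrative placeholders, not a model answer:

```python
# Illustrative FMEA table: failure mode -> (Occurrence, Severity,
# Detectability), each on a 1-10 scale. Ratings are made up for this
# sketch and would need justification in the coursework.
failures = {
    "currency devaluation":       (7, 6, 3),
    "tightened sanctions":        (4, 9, 5),
    "new-process disruption":     (6, 5, 4),
    "safety-compliance accident": (3, 10, 6),
}

def rpn(occurrence, severity, detectability):
    """Risk Priority Number = Occurrence x Severity x Detectability."""
    return occurrence * severity * detectability

# Rank failure modes from highest to lowest RPN.
ranked = sorted(failures.items(), key=lambda kv: rpn(*kv[1]), reverse=True)
for name, rating in ranked:
    print(f"{name}: RPN = {rpn(*rating)}")
```

Control measures in Task 1(c) would lower one or more of the three ratings, so re-estimating after mitigation is just re-running the same product with the new values.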
SEMESTER 2 2024/25 COURSEWORK BRIEF:
Module Code: MANG6537
Assessment: Individual Coursework
Weighting: 100%
Module Title: Project Management
Submission Due Date: @ 16:00
Word Count: 2,000
This assessment relates to the following module learning outcomes:
A. Knowledge and Understanding
A1. Appreciate the specific nature of projects and its implications on managing projects.
A2. Gain knowledge of traditional and contemporary approaches to the management of projects.
A3. Understand the people, organisational and functional aspects of projects and their strategic management and leadership.
A4. Gain knowledge of tools and techniques for organising, managing, and leading people, organisational and functional aspects of projects.
A5. Develop in-depth appreciation of various factors influencing project success and failure.
B. Subject Specific Intellectual and Research Skills
B1. Appreciate the sources of complexities in actual projects and project environments.
B2. Analyse project situations and dynamics considering the pluralistic nature of projects and project management.
B3. Critically analyse projects’ context, potential dynamics, and challenges regardless of their industry.
C. Transferable and Generic Skills
C1. Appreciate the complexity of the actual business world.
C2. Develop analytical and critical approaches to management and leadership.
C3. Organise, present, and communicate your professional and academic views in writing.
C2. Communicate effectively (written).
Coursework Brief:
The coursework intends to assess the depth and breadth of your appreciation, reflection, and critical analysis of project management. For this assignment, you are required to select one recent, impactful project and conduct an in-depth analysis using topics and perspectives covered in this module.
You are expected to:
• demonstrate good capability of collecting secondary data on your chosen project from valid sources (e.g., official reports, the project’s or the organisation’s website, reputable news agencies, etc.)
• review the project management approach used
• identify the key successes and/or failures in the project practice, and
• reflect on whether and how different approaches could have led to better outcomes.
Suggested Structure
• Title Page (module code, assignment title, student ID, and word count)
• Table of Contents (list of numbered headings and sub-headings)
• Introduction to the project and purpose of the assignment (300 words)
• Academic literature review (on relevant project management approaches and previous studies on projects and/or organisations in similar contexts) (700 words)
• Analysis and discussion (700 words)
• Conclusion (300 words)
• List of References
• Appendices (if necessary): Details of additional relevant information.
Other Key Requirements
• +/-10% either side of the word count is deemed acceptable. Any text that exceeds an additional 10% will not attract any marks.
• Word count excludes text in tables, figures, the list of references, and appendices.
• Rigorous application of the required academic referencing style.
• Main text should be presented using 12 point font size, Times New Roman or Arial, and 1.5 line spacing.
SOC202H1S Introduction to Quantitative Research Methods
Due date, submission, and instructions
Please submit an electronic copy on Quercus in a single document that contains all lab components. Please read the detailed instructions about the expected content and formatting of these lab assignments in the document Lab assignment instructions available on Quercus.
Learning Objectives
In this tutorial you will:
● Learn how to create a new variable based on variables in the data set
● Practice formulating hypotheses
● Use SPSS to conduct a two-sample means test
● Use SPSS to graphically compare the means (and confidence intervals) of two groups: learn how to produce an error bar graph
● Practice interpreting the results of hypothesis tests
Data
The exercises assigned this week are based on the General Social Survey (GSS). You can download the "GSS2013_shortened.sav" data file from the course website on Quercus (go to "Tutorials" > "Datasets").
Exercise 1 (90% of mark)
The research question you will be investigating this week is: Does the average number of friends differ between men and women?
a) For this exercise, you are going to create a new variable capturing respondents' number of friends based on the "Number of Close friends [scf_100c]" and the "Number of other friends [scf_110c]" variables. Generate the new variable by adding the two existing variables, following the instructions in the tutorial.
Check your work: Calculate basic descriptive statistics (mean, standard deviation, minimum and maximum). What is the lowest and what is the highest number of friends? What is the average number of friends? Do these values look plausible?
b) Calculate the mean number of friends for women and men. Instruction: Go to Analyze > Compare Means > Means... > move the variable "Number of friends" into the box "Dependent List" and the variable "Sex of respondent" (sex) into the "Layer 1 of 1" box > click on OK. Include this descriptive table in your report.
● Write a sentence of interpretation about how these sample means compare.
● Explain why we need to conduct a two-sample significance test when we can tell whether there is a difference just by examining the two means in the table.
c) Following the instructions in SPSS DEMONSTRATION 11.2 (p. 381), produce an error bar graph comparing women's and men's average number of friends. Interpret the graph's point estimates and confidence intervals separately (one sentence each). Then interpret both of the graphs together (one sentence only). (That's three sentences in total: one for men, one for women, and one for both of them together.)
Next, you are going to conduct a two-sided two-sample test for means at the .05 significance level.
d) State the hypotheses mathematically and literally.
e) Perform a two-sample means test using Demonstration 11.3 (p. 382). In order to conduct this hypothesis test, you need to know the numerical values assigned to the categories "male" and "female" on the variable sex. To find this out, consult the GSS CODEBOOK.
f) State whether you reject or fail to reject the null hypothesis H₀ you formulated in part d) at the .05 significance level.
g) Provide a verbal interpretation of the rejection decision.
h) How do the findings from the two-sample means test compare to the findings based on the error bar graph in part c)? (1-2 sentences)
i) What questions do these findings raise? Why might this finding come about? (This is, of course, somewhat speculative. Please limit this to 3 sentences.)

Exercise 2 (10% of mark)
Find the variables cerd230c and grp_40. Familiarize yourself with the codebook entries for both of these variables, especially the missing values.
a) Provide frequency tables for the two variables. If someone states "None" for cerd230c, what appears to happen to their response for grp_40?
b) Run a t-test investigating the effect of gender on grp_40 and provide the output before you recode any missing values.
c) Now run a t-test investigating the effect of gender on grp_40 and provide the output after you recode any individuals who answer "None" for cerd230c to an interpretable value on grp_40 (see the reading notes for more information about this).
d) What is the difference in interpretation between these two t-tests? Which one seems more useful in this case? (2-3 sentences)

Learning Objectives
In these exercises, you will:
● Practice how to interpret bivariate regression analysis, including the regression slope, the constant, the correlation coefficient r, and the coefficient of determination

Data
You may use either the GSS or the Canadian Community Health Survey for this assignment.

Exercise 1 (20% of mark)
You may find SPSS DEMONSTRATION 13.2 at the end of Chapter 13, in the section "You Are the Researcher" (p. 447), useful for this section.
Read through the entire directions before you choose variables. Note that you will need to make a causal argument, and one of your interval-ratio variables will be a dependent variable in both analyses. Using either the GSS or the Canadian Community Health Survey dataset, you will select one nominal variable that has more than two categories (non-dichotomous) and two interval-ratio variables.
a) Use the Descriptives command to get means and standard deviations for your two interval-ratio variables. Discuss these descriptive findings in a couple of sentences. (Hint: You did something similar back in Lab A.) Include the table in your answer.
b) Use the Frequencies command to get the frequency tables for your nominal variable. Discuss the distribution of the nominal variable and write a couple of sentences about what the distribution means in real-world terms. (Hint: You did something similar back in Lab A.) Include the table in your answer.
c) Transform your nominal variable into a new dichotomous variable, with "0" representing "No" and "1" representing "Yes." Make sure this new variable has an easy-to-remember name! Then create a two-way frequency table for your old nominal variable and your new nominal variable to make sure the transformation worked; include the table in your answer.

Exercise 2 (40% of mark)
Examining Theory: X = Nominal, Y = Interval/Ratio
a) Generate a causal theory about your nominal variable and your interval/ratio variable. Which one is likely to cause the other, and why? Define x and y. [Note: this should be approximately two sentences; one to explain which one is likely to cause the other, and one to explain why. If you are having trouble coming up with a relatively plausible theory, you may want to select different variables.]
b) Calculate the correlation coefficient. Report the output and interpret the correlation coefficient (one sentence).
c) Run the OLS regression. Report the output, and interpret the regression coefficient and intercept (one sentence for the coefficient, one for the intercept).
d) Interpret R-squared.
e) Test your theory using all six steps from the video lecture and reading notes. When the regression output gives you what you need, you do not need to calculate it out by hand; simply report what you found from the regression output above.
f) Based on the tests of statistical significance and measures of effect size (correlation coefficient, regression coefficient, and R-squared), evaluate your theory. Do not write more than 3-4 sentences.

Exercise 3 (40% of mark)
Examining Theory: X = Interval/Ratio, Y = Interval/Ratio
a) Generate a causal theory about your two interval/ratio variables, using the same dependent variable as in the previous section. Which one is likely to cause the other, and why? Define x and y. [Note: this should be approximately two or maybe three sentences; one to explain which one is likely to cause the other, and one to explain why. If you are having trouble coming up with a relatively plausible theory, you may want to select different variables.]
b) Calculate the correlation coefficient. Report the output and interpret the correlation coefficient (one sentence).
c) Run the OLS regression. Report the output, and interpret the regression coefficient and intercept (one sentence for the coefficient, one for the intercept).
d) Interpret R-squared.
e) Test your theory using all six steps from the video lecture and reading notes. When the regression output gives you what you need, you do not need to calculate it out by hand; simply report what you found from the regression output above.
f) Based on the tests of statistical significance and measures of effect size (correlation coefficient, regression coefficient, and R-squared), evaluate your theory. Do not write more than 3 or 4 sentences.
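The labs read these quantities off SPSS output rather than computing them by hand. Purely as an illustration of what that output contains, the sketch below computes the slope, intercept, Pearson's r and R² for a bivariate regression; the class and variable names are illustrative, not part of the assignment.

```java
// Minimal sketch of the bivariate-regression quantities the labs ask you to
// interpret: slope b, intercept a, correlation coefficient r, and R^2.
public class SimpleOls {

    static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    /** Returns {slope, intercept, r, rSquared} for y regressed on x. */
    static double[] fit(double[] x, double[] y) {
        double mx = mean(x), my = mean(y);
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - mx) * (y[i] - my);   // cross-deviations
            sxx += (x[i] - mx) * (x[i] - mx);   // deviations of x
            syy += (y[i] - my) * (y[i] - my);   // deviations of y
        }
        double slope = sxy / sxx;
        double intercept = my - slope * mx;
        double r = sxy / Math.sqrt(sxx * syy);
        return new double[]{slope, intercept, r, r * r};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};              // exactly y = 2x + 1
        double[] res = fit(x, y);
        System.out.printf("b=%.2f a=%.2f r=%.2f R2=%.2f%n",
                res[0], res[1], res[2], res[3]);
    }
}
```

On perfectly linear data like the example, r and R² are both 1; on survey data they tell you how much of the variation in y the regression line accounts for, which is exactly what parts c) and d) ask you to interpret.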
CSIT 5710 Problem Set 1

Problem 2 (4pts) PRF security game. We recall the definition of a secure PRF. Let F: K × X → Y be a function with key space K, input space X and output space Y. The PRF security game is played between an adversary A and a challenger C, defined as follows:
1. C picks a bit b ∈ {0,1}.
2. If b = 0, then C samples k ← K and sets f ← F(k, ·). If b = 1, C samples a uniformly random function f: X → Y.
3. PRFQuery: A chooses x ∈ X and sends x to C, who replies with f(x).
4. A can repeatedly make PRFQueries to the challenger (repeating step 3).
5. A outputs a bit b' ∈ {0,1} and wins if b' = b.
For a secure PRF, the probability of any probabilistic polynomial-time adversary winning the above game is at most 1/2 + ε(n), for some negligible function ε of the security parameter n.
(a) (1pt): Let F: {0,1}^n × {0,1}^n → {0,1}^n be a secure PRF. Let F': {0,1}^n × {0,1}^n → {0,1}^n be the function [formula missing]. Is F' a secure PRF? If so, provide a proof of its security; otherwise, describe an attack that adversary A can use to break its security and explain why the attack works.
(b) (1pt): Let F: {0,1}^n × {0,1}^n → {0,1}^n be a secure PRF. Let F': {0,1}^n × {0,1}^n → {0,1}^n be the function [formula missing]. Is F' a secure PRF? If so, provide a proof of its security; otherwise, describe an attack that adversary A can use to break its security and explain why the attack works.
(c) (2pts): Let F be a secure PRF defined over (K, X, Y), where K = X = Y = {0,1}^n. Let K₁ = {0,1}^{n+1}. Construct a new PRF F₁ over (K₁, X, Y) with the following property: the PRF F₁ is secure; however, if the adversary learns the last bit of the key, then the PRF is no longer secure. This shows that leaking even a single bit of the secret key can destroy the PRF security property. You need to provide the construction of F₁, prove it is a secure PRF, and show that if the adversary learns the last bit of the key, then the PRF is no longer secure.
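The structure of the game above can be sketched in code. This is only an illustration of the challenger's two worlds, not part of the required answers: HmacSHA256 stands in for an (assumed) secure PRF F(k, ·), and the b = 1 world is a uniformly random function realized lazily, so each distinct input is answered with an independent uniform value and repeated queries stay consistent.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch of the PRF security game's structure. HMAC-SHA256 stands in for a
// secure PRF F(k, x); the b = 1 world is a truly random function f, built
// lazily so that f is still a well-defined function under repeated queries.
public class PrfGame {
    private final boolean realWorld;                 // b == 0 -> real PRF world
    private Mac prf;                                 // F(k, .) when b == 0
    private final Map<String, byte[]> table = new HashMap<>(); // lazy random f
    private final SecureRandom rng = new SecureRandom();

    public PrfGame(boolean realWorld) {
        this.realWorld = realWorld;
        try {
            prf = Mac.getInstance("HmacSHA256");
            byte[] key = new byte[32];
            rng.nextBytes(key);                      // step 2: sample k <- K
            prf.init(new SecretKeySpec(key, "HmacSHA256"));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** One PRFQuery (step 3): the adversary sends x, C replies with f(x). */
    public byte[] query(byte[] x) {
        if (realWorld) return prf.doFinal(x);        // f = F(k, .)
        // Uniformly random function: sample once per distinct input.
        String key = Base64.getEncoder().encodeToString(x);
        return table.computeIfAbsent(key, unused -> {
            byte[] out = new byte[32];
            rng.nextBytes(out);
            return out;
        }).clone();
    }

    public static void main(String[] args) {
        PrfGame game = new PrfGame(false);           // b = 1: ideal world
        byte[] x = {1, 2, 3};
        // Even the ideal world is a *function*: repeats must agree.
        System.out.println(java.util.Arrays.equals(game.query(x), game.query(x))); // true
    }
}
```

An adversary's job in the game is to distinguish the two constructor arguments; a secure PRF means no polynomial-time strategy does so with probability noticeably above 1/2.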
ACP CW2 Specifications (Essay) 12.03.2025

As you implemented CW2, many decisions were made based on the specifications and often on your own consideration and thoughts. The essay is now the place where you should:
1. Reason and reflect on why you chose the specific implementation for the specific tasks
2. Explain how you would test such an environment in and during development
3. Discuss what challenges you see running this service in terms of load, stability and scaling
4. Identify where you see improvements for your code

This is going to be rather tough, as you have much to cover and plenty of considerations to make, in a concise manner, with a hard upper limit set to 1,400 words. As a rough guide, you should aim for a total of between 1,100 and 1,400 words.

Writing such a work requires (among other things):
· Reasoning and reflection – so less what you did, but why and why not, what influenced you, where you struggled (and overcame the problem), ...
· References, pros & cons, discourse, etc.
· Proper formatting which improves the readability of the document (take academic papers from, e.g., IEEE as an example)
· Readability, which includes full sentences, introductions, conclusions, bridging sentences, ... (in general, give it a nice flow for the reader)

Please focus on the core questions asked and do not delve (unless needed and reasoned) into common topics like "why microservices", "what are unit tests", etc. Neither are you supposed to write documentation for the work you delivered (this can be done in the code). You will have to find the right balance between mentioning your work and general concepts / considerations. You should not discuss your implementation one item after the other, but give overarching considerations, insights and views.
You will be scored (a total of 25 points max) on the following sections with the mentioned sub-grades (intermediate grades are possible as well):
- (6) Readability, Structure, Presentation
  o Poor (1), Fair (3), Very Good (6)
- (10) Technical relevance and correctness
  o Poor (1), Fair (4), Very Good (10)
- (9) Completeness (coverage), Explanation and reasoning
  o Poor (1), Fair (4), Very Good (9)

Should you exceed the word count, 15% of your final mark will be deducted for each 100 words over the limit (so, for e.g. 1,510 words in total you would receive a penalty of 15% of the total mark gained; for 1,610 words, 30%; etc.). This might seem drastic, yet we must simply make sure that you get to a conclusion, and a marker must read your submission as well.

The grading is intentionally kept simple to improve marking and the elaboration of marks:
Poor: Something has been done in the section, yet not enough
Fair: Moderately average, nothing too special, yet quality work
Very good: A very good amount of work was done, and the results correspond
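Read literally from the two worked examples (1,510 words gives 15%, 1,610 gives 30%), the penalty appears to be 15 percentage points per full 100 words over the 1,400-word limit. The helper below is hypothetical and encodes only that reading; how an excess of fewer than 100 words is treated is not spelled out in the brief.

```java
// Hypothetical helper matching the worked examples in the brief
// (1,510 words -> 15 %, 1,610 words -> 30 %): 15 percentage points are
// deducted per full 100 words over the 1,400-word limit.
public class WordCountPenalty {
    static final int LIMIT = 1400;

    static int penaltyPercent(int words) {
        if (words <= LIMIT) return 0;
        return 15 * ((words - LIMIT) / 100);   // integer division: full 100-word blocks
    }

    public static void main(String[] args) {
        System.out.println(penaltyPercent(1510)); // 15
        System.out.println(penaltyPercent(1610)); // 30
    }
}
```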
School of AI and Advanced Computing Coursework. Due: Sunday, March 23rd, 23:59 (UTC+8 Beijing)
ACP Assignment 2 Specifications: Programming

In this assignment you are supposed to implement a service which provides several endpoints to communicate with the AcpStorageService (BLOB), Kafka, RabbitMQ and redis. The auto-marker will call your endpoints using PUT, POST and DELETE accordingly. Because data is passed in JSON format in the body of a message, we only have classical GET methods where no JSON data is needed, as a GET request is not supposed to use the body of the message.

Your main tasks can be summarized as follows (no changes from assignment 1):
1. Create a Java REST service
- Preferably with Spring Boot, though other frameworks can be used as well
- Port 8080 is used
- Implement the endpoints
- Proper parameter handling
- Proper return code handling
- JSON handling
2. Save the docker image in a file called acp_submission_image.tar
3. Place the file acp_submission_image.tar in the root directory of your solution. Your directory would look something like this:
acp_submission_2
  acp_submission_image.tar
  src (the Java sources ...)
    main
    ...
4. Create a ZIP file of your solution directory
- Image
- Sources
- IntelliJ (or whatever IDE you are using) project files
5. Upload the ZIP as your submission in Learn

For the service you can either use your own implementation or the template provided on GitHub at https://github.com/mglienecke/AcpCw2Template.git
This repository is in constant development, so use it as a basis, but not as a fixed, never-changing element. The main factors for using it are the ability to have the actuator endpoints (see today's lecture) and some general housekeeping tasks (like creating an environment, etc.).

General information:
• All mandatory connection information for subsystems will be passed as environment variables, or, if explicitly specified, in the endpoints.
• The properties used in all Kafka examples which are not passed in as environment variables or in the request as JSON data can be assumed to be constant (see the example in the repository). This includes things like the offset, the type of serializer, etc.
• So, you must:
  o Read the necessary environment variables
  o Additionally, for some requests, handle additional data
• All JSON passed in will always be syntactically correct, so you can ignore error handling there. What you still must check is that the data gives you a proper connection and that no wrong address or invalid user / password is passed (so classical flow errors in an application).
• No knowingly invalid data will be passed.
• Parameters given as { ... } in a task will be replaced at runtime with the corresponding value. So, {queueName} could become s12345678_outbound_queue.
• The points after a task are the maximum achievable points for the individual task.
• The auto-marker will not deliberately try to provoke 500 codes, yet should a 500 arise, this is a clear indicator that you didn't catch an exception ...
• All endpoints are expected to deliver a 200 response to indicate success.
• No prefix (like /api/v1) is to be used. Every endpoint must be reachable from the bound root.

Predefined environment variables
The following environment variables will be available:
- REDIS_HOST (e.g. host.docker.internal)
- REDIS_PORT (e.g. 6379)
- RABBITMQ_HOST (e.g. host.docker.internal)
- RABBITMQ_PORT (e.g. 5672)
- KAFKA_BOOTSTRAP_SERVERS (e.g. kafka:9093)
- ACP_STORAGE_SERVICE (e.g. https://acp-storage.azurewebsites.net/)

The AcpStorageService can be investigated using https://acp-storage.azurewebsites.net/swagger-ui/index.html
For Kafka, RabbitMQ and redis you can assume a local dockerized environment for the auto-marker.
So, you can either use something similar, or confluent.io (you have the registration information), but then you will have to set some security information like the security protocol, SASL mechanism and JAAS configuration. The auto-marker uses no authentication, so any authentication present in the connection properties will cause a failure. Do not (!) use the Kafka Admin API, as this will not be necessary and wouldn't be available in a larger environment.

The mark distribution will be:
- PUT RabbitMQ (3)
- GET RabbitMQ (3)
- PUT Kafka (3)
- GET Kafka (3)
- POST processMessages (17)
- POST transformMessages (16)

For the 2 larger items, marks for sub-achievements will be given. The simpler tasks are "all or nothing" (so, 3 or 0 points), unless there are repeated errors. This assignment has many subtle details. Please check everything exactly, as otherwise you will lose points.

The REST service must provide the following endpoints:
• (3) PUT rabbitMq/{queueName}/{messageCount}
Write {messageCount} messages into the queue defined by {queueName}. Each message has to be in the following JSON format:
{
  "uid": "replace with your student id",
  "counter": numerical index of the message, starting at 0
}
• (3) GET rabbitMq/{queueName}/{timeoutInMsec}
Return the data read from the queue {queueName} as a List from the service, one entry being one message. You read until {timeoutInMsec} has passed. If you stay longer than that plus 200 msec in the routine, this will be considered a fail. So, if the timeout is 1,000 msec, you must return after 1,200 msec at the very latest. Normally your timeout will be quite small (around 100 msec).
• (3) PUT kafka/{writeTopic}/{messageCount}
Same as above with RabbitMQ, just now into a Kafka topic.
• (3) GET kafka/{readTopic}/{timeoutInMsec}
Same as above with RabbitMQ, just now a Kafka topic, and the timeout will be a bit larger (around 500 - 5000 msec).
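The message bodies for the PUT endpoints above can be sketched as follows. This shows only the JSON construction; actually publishing would go through the RabbitMQ or Kafka client library, and the student id "s1234567" used in the demo is a placeholder, not part of the specification.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the message bodies for PUT rabbitMq/{queueName}/{messageCount}
// (and the analogous Kafka endpoint). Only the JSON construction is shown;
// publishing the bodies would use the RabbitMQ / Kafka client libraries.
public class PutMessages {

    /** Builds {messageCount} bodies of the form {"uid": ..., "counter": n}. */
    static List<String> buildBodies(String uid, int messageCount) {
        List<String> bodies = new ArrayList<>();
        for (int counter = 0; counter < messageCount; counter++) {
            // counter is the numerical index of the message, starting at 0
            bodies.add(String.format("{\"uid\": \"%s\", \"counter\": %d}", uid, counter));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // "s1234567" is a placeholder student id.
        buildBodies("s1234567", 3).forEach(System.out::println);
    }
}
```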
• (17) POST processMessages
In the call you will receive additional JSON body data in the following format:
{
  "readTopic": "topicname",
  "writeQueueGood": "queuename",
  "writeQueueBad": "queuename",
  "messageCount": value (between 1 and 500)
}
The overall system diagram can be depicted as follows (figure not reproduced here).
Your task is to read {messageCount} JSON data items (no time constraint) from the Kafka topic (reset your offset to the beginning!) in the following format:
{
  "uid": "your UID in the format sXXXXXX",
  "key": a 3-, 4- or 5-character-long character sequence as a string,
  "comment": a string,
  "value": floating point value
}
If the key is 3 or 4 characters long, this is a "good" packet and you store the data packet in the AcpStorageService. Before you can do this, you have to add a field to the JSON:
"runningTotalValue": running total of all good message values up to this point
So, you are writing the original message, with the running total as a new field in the JSON object, to the ACP storage service. The UUID returned by the service is added to the original JSON object which you read (as field uuid in the JSON object, e.g. "uuid": "12121-xxx ...") and the whole is written to {writeQueueGood}. For "bad" packets, you write the message directly (so, without storing!) to {writeQueueBad}. After you are finished, you write a new packet like the above with the key "TOTAL" to both the good and the bad queue, where you put as the value the sum of the values of the packets added to the corresponding queue. The comment is not relevant (but must be set), and no UUID must be added to this message as no data was written.
Example:
- You receive 3 messages AAA, ABCD and ABCDE with a value of 10.5 each
- You write 2 messages to the store (as good)
  i. Incrementing the running total accordingly (10.5, 21.0)
- You write 2 messages (with the corresponding UUID) to the good queue
- You write 1 message (no store) to the bad queue
- You write a TOTAL message with 21 to the good queue (comment "")
- You write a TOTAL message with 10.5 to the bad queue (comment "")
- Your UID is just copied along

• (16) POST transformMessages
In the call you will receive additional JSON body data in the following format:
{
  "readQueue": "queuename",
  "writeQueue": "queuename",
  "messageCount": value (between 1 and 500)
}
Your main task is to read {messageCount} messages (no time limit) from the readQueue (RabbitMQ), process them and write them to the writeQueue. Each message can be in one of 2 formats: "normal" and "tombstone". "Tombstone" messages are often used to synchronize something or to mark the end of a process, etc. (thus the name ...).
The "normal" format is:
{
  "key": any string,
  "version": integer (1..n),
  "value": any float
}
The "tombstone" is:
{
  "key": any string
} (no additional data ...)
For normal packets you have to check if the key in this specific version is already present in redis. If yes, you just pass the packet 1:1 to the writeQueue without processing. If it is not present, or present in an older version (so redis version < current version), then you store the entry in redis and pass the packet, with its value increased by 10.5, to the writeQueue. For tombstone packets you remove the key from redis and act as if that key had not been set before. You also write a special packet to the outbound queue:
{
  "totalMessagesWritten": integer,
  "totalMessagesProcessed": integer,
  "totalRedisUpdates": integer,
  "totalValueWritten": float, (the total of all packets up to now)
  "totalAdded": float (the total of all the 10.5 increments added up to now)
}
These are the running values at the moment the tombstone was hit. Tombstones can occur several times!
Example:
- Receive "ABC", Version 1, Value 100
  i. Store in redis
  ii. Write "ABC", Version 1, Value 110.5 to the out queue
- Receive "ABC", Version 1, Value 200
  i. Write "ABC", Version 1, Value 200 to the out queue
- Receive "ABC", Version 3, Value 400
  i. Store in redis
  ii. Write "ABC", Version 3, Value 410.5 to the out queue
- Receive "ABC", Version 2, Value 200
  i. Write "ABC", Version 2, Value 200 to the out queue
- Receive "ABC" -> Tombstone
  i. Clear redis
  ii. No write
- Receive "ABC", Version 2, Value 200
  i. Store in redis
  ii. Write "ABC", Version 2, Value 210.5 to the out queue (as new!)
So, you can receive out-of-sync (older-version) packages as well.

The following should be considered when implementing the REST service:
• Do proper checking: all data will be valid, yet you still have to check some things (like return codes, exceptions, etc.)
• Your endpoint names have to match the specification
• Test your endpoints using a tool like Postman or curl. Plain Chrome / Firefox, etc. will do equally well for the GET operations
• The filename for the docker image file has to be exactly as defined, as well as its location in the ZIP file. Should you be in doubt, use copy & paste to get the name right