YEAR: 2022-23
MODULE CODE: GEOG0093
MODULE NAME: Conservation and Environmental Management
COURSE PAPER TITLE: Individual Grant Proposal - NERC
WORD COUNT: 1998

Investigating the interaction between the Irrawaddy dolphin (Orcaella brevirostris) and fisheries in the Eastern Gulf Coast of Thailand

Project Summary

The Irrawaddy dolphin (Orcaella brevirostris) is a coastal and freshwater mammalian species found sparsely distributed throughout the tropical and subtropical Indo-Pacific (Jackson-Ricketts, 2016). Its "Endangered" classification on the International Union for Conservation of Nature's (IUCN) Red List (Minton et al., 2017) stems largely from anthropogenic threats, particularly those relating to fisheries (Nelms et al., 2021). In Thai waters, overexploitation of fish stocks is an ongoing issue causing reductions in fishery production and biomass (Wongrak et al., 2021), all of which can affect fish-eating mammals such as O. brevirostris. To date, little research has been done to quantify resource overlap between O. brevirostris prey and fisheries. Across a one-year period, this study will evaluate prey resource and spatial overlap between O. brevirostris subpopulations in the Trat Province of Thailand and fisheries. The results could support ecosystem-based fishery management in the future, with the intention of improving O. brevirostris populations as well as fishery catch yields.

Case for Support

Part 1 - Literature Review

O. brevirostris is a cetacean species sparsely distributed across coastal waters of the subtropical and tropical Indo-Pacific (Jackson-Ricketts, 2016; Tubbs et al., 2020). As with other cetaceans, they are considered important top-down controllers of their marine environment, primarily through predator-prey relationships (Jackson-Ricketts, 2016; Ricci et al., 2021).
Moreover, they play a role in the modification of benthic habitats as well as nutrient recycling, and are considered essential players in a marine ecosystem's health and integrity (Bowen, 1997; Nelms et al., 2021). O. brevirostris marine populations are preferentially distributed within shallower coastal waters in close proximity to shores and river mouths (Minton et al., 2011; Minton et al., 2017). This association is thought to be driven by freshwater input and prey distributions (Minton et al., 2017). Currently listed as "Endangered" on the IUCN Red List of Threatened Species, the close association of O. brevirostris with the coast has left them vulnerable to a host of anthropogenic threats including, but not limited to, bycatch, shipping vessel collisions, and prey overfishing (Jackson-Ricketts, 2016; Parsons et al., 2015). For example, in the Kuching region of Malaysia, 50% of O. brevirostris sightings occurred within 50 metres of fishing vessels, increasing the risk of direct interactions (Minton et al., 2011). The death of fourteen O. brevirostris between January and February 2013 as a result of fishing gear entanglement near Trat Province in the Gulf of Thailand (Hines et al., 2015) further highlights this issue. Overlap between cetaceans and fisheries is not globally problematic, but rather restricted to certain regions and species (Kaschner & Pauly, 2005). The Eastern Gulf coast of Thailand is a problematic area for subpopulations of O. brevirostris; however, to date little research has been done on the relative overlap (Nelms et al., 2021). According to the United Nations (UN) Food and Agriculture Organisation, Thailand's Exclusive Economic Zone (EEZ) comprises 41% of the marine catch within the Gulf of Thailand (FAO, 2013). Thailand's fishing industry is commercially significant, contributing 2.4 million tonnes to global fishery production and employing almost 250,000 people annually (Kulanujaree et al., 2020).
Commercially significant species include the Indo-Pacific mackerel (Rastrelliger spp.), which accounts for around 41% of catches, as well as anchovies (Encrasicholina spp. and Stolephorus spp.), round scads (Decapterus spp.) and ponyfishes (Leiognathus spp.) (FAO, 2013; Klangnurak & True, 2022). Overexploitation of fish stocks within Thai waters is an ongoing issue: due to overfishing, Thailand saw a 39% reduction in fishery production across a 10-year period (Wongrak et al., 2021). Although this change was partially stemmed by reductions in illegal fishing since 2015, evidence of overexploitation is still present, such as the decreasing trend in Indo-Pacific mackerel length from 18 cm to 15 cm (FAO, 2013). Previous stomach content analysis of O. brevirostris subpopulations in the Eastern Gulf Coast of Thailand suggested their diet is mainly piscivorous, with a preference for ponyfishes, mackerel, and scad (Jackson-Ricketts, 2016). Although this is suggestive of resource overlap, few studies have quantified the relationship within this region, and thus further research is warranted.

Part 2 - Description of the Proposed Research

Rationale

Despite O. brevirostris being legally protected (Partnership, 2010), their population is still in decline, having moved from "Vulnerable" to "Endangered" on the IUCN Red List (Minton et al., 2017), suggesting current conservation approaches are insufficient to stem it. The death of only 4.2 individuals a year across a period of 60 years (3 generations) would result in a further 50% decline in population size (Moore, 2015). With subpopulations generally ranging from 10 to 100 in size, mass mortality events resulting from fishery interactions, such as the aforementioned death of fourteen O. brevirostris in the Trat Province of Thailand (Hines et al., 2015), far exceed this trajectory.
Furthermore, decreases in the biomass of fish stocks due to overfishing not only affect commercial fisheries; sub-optimal prey could also affect O. brevirostris growth and reproduction, further exacerbating their decline (Minton et al., 2011). Therefore, it is essential that a proactive, data-led conservation stance is taken to avoid this species' extinction. Data scarcity has thus far limited ecosystem-based fishery management approaches (Cheng et al., 2022). In other marine ecosystems, increases in marine mammal populations, when not monitored alongside the continuing growth in commercial fisheries, have led to further damage of fish stocks (Jusufovski et al., 2019). For effective conservation to take place, it is essential to have a better understanding of the overlap between fisheries and O. brevirostris. With projects currently underway by the UN Environment Programme that look to integrate conservation into fishery management through the creation of a "Regional System of Fisheries Refugia in the South China Sea and the Gulf of Thailand" (Siriraksophon, 2022), now is the time to ensure sufficient data is available to guide these projects.

Research objectives and outcomes

1. To analyse O. brevirostris prey consumption rate and prey composition within the Trat Province, Thailand, by stomach content analysis of stranded animals
2. To establish resource overlap by comparison of the O. brevirostris prey consumption rate to fisheries catch rate
3. To compare the spatial overlap between O. brevirostris habitat (through use of electronic tags) and local fishing grounds, by tracking fishing vessels in the area

Methodology and approach

The Gulf of Thailand is a continental shelf bordering Thailand, Cambodia, Malaysia and Vietnam (between latitudes 5°00′ and 13°30′ N and longitudes 99°00′ and 106°00′ E) (Jackson-Ricketts, 2016).
This research will focus on previously sighted (Ponnampalam et al., 2013) O. brevirostris populations in the Trat Province, an area within Thailand's EEZ (see Figure 1).

Figure 1: Map highlighting the Trat Province (in yellow) within the Gulf of Thailand. Adapted from Hines et al. (2015).

To understand the composition of O. brevirostris prey, the stomach contents of stranded animals will be collected and analysed. Stranded dolphins will be located through the Phuket Marine Biological Centre's (PMBC) stranding network. Any deceased dolphins will have their stomach removed and frozen for laboratory analysis. Once thawed, stomach contents will be sieved to separate hard and soft tissues. Fish bones and otoliths will be dried, whilst cephalopod beaks will be stored in ethanol. Identification will be done using published guides (Giménez et al., 2017). The relative importance of each prey type to the Irrawaddy dolphin will be quantified using equations established in Giménez et al. (2017), based on the proportion of prey and prey type against the total number of prey found within the stomach. Prey weight will be reconstructed by standard regressions. Overall annual consumption rate will be calculated using methods established by Santos et al. (2014), where O. brevirostris population size will be estimated based on averages for the area found in Hines et al. (2015). To establish the geographical range of O. brevirostris, bio-tagging will be administered using modified methods from Wells et al. (2021). In brief, personnel and boat resources from PMBC will assist in the tagging process. Capture sites will be within shallow depths (less than 3 m). Once spotted, O. brevirostris will be captured by encirclement using seine nets before being transferred to boats. Staff will be nominated to monitor in case of any entanglements, and to ensure the animals are kept wet with sponges once on the boats. A veterinarian will be present to continuously monitor captured animals.
SPLASH10 tags will be attached to dorsal fins by a single pin to transmit location data via satellite. As fishing is carried out both during the day and night in this region (Noranarttragoon, 2014), it is essential that a full day is covered, so tags will be programmed to transmit Global Positioning System (GPS) data every 15 minutes (Hays et al., 2021). Data will be collected using the Argos Data Collection and Location System. For the sample size to be representative of the population in this region, 10 individuals will be tagged in each deployment, biannually. By Thai law, commercial fishing vessels must register with the Department of Fisheries (DOF) (Kulanujaree et al., 2020). To establish the geographical range and movements of fishing vessels in the Trat region, geographical data will be collected from the Marine Fisheries Research and Development Division within the DOF (Tepparos et al., 2022). For vessels ≥30 GT, which must be fitted with a vessel monitoring system (VMS), this will take the form of hourly time-coded coordinates transmitted to the DOF (Tepparos et al., 2022). Vessels between 10 to
Assignment #5 - Data from the Web, Combining Data

Click on this link: https://classroom.github.com/a/zYnA1PBu to accept this assignment in GitHub classroom. This will create your homework repository. Clone your new repository.

In this homework, you'll:

1. "Screen scrape" data from a website
   - Use requests and BeautifulSoup to download and parse html
   - Use merge to join DataFrames by a key
2. Work with JSON / Use an API
   - Use the json module with requests

Part 1 - Combine Course Info with Requirements

(ASCII art omitted; source: http://www.oocities.org/spunk1111/school.htm, with modifications)

Prep

In this part of the assignment, you'll work on parsing html with a library and regular expressions to extract course information data. Additionally, you'll use the merge function to put together data from two different DataFrames based on a key.

1. Download the course schedule for this semester (https://cs.nyu.edu/dynamic/courses/schedule/) by right-clicking and choosing "Save As". Save this in the root of your project repository; give it a short, but descriptive file name.
2. Download the course catalog (https://cs.nyu.edu/dynamic/courses/catalog/) by right-clicking and choosing "Save As". Save this in the root of your project repository; give it a short, but descriptive file name.
3. Make sure to install any modules necessary for working with html.
4. Open up the empty notebook, courses.ipynb, to work on this part of the assignment.

Instructions

1. Read the course schedule into a DataFrame.
- the frame should have the following columns:
  - Number-Section: the course number and section number
    - Use your discretion to deal with issues encountered here (for example, some issues may include courses listed with two different course numbers!), but document what you've done and give some rationale for your methodology
    - ⚠ you may encounter an invisible space character (a zero width space) in the course number depending on how you extract the text (view the markup in Chrome's web inspector tools or try printing it out in Python); the easiest way to deal with this is to replace it with an empty string (assuming s contains the zero width space): s.replace('\u200b', '')
  - Name: the name of the course
  - Instructor: the name of the professor
    - Again, there may be issues here, such as multiple instructors; use your discretion, but describe what you've done and why
  - Time: the day(s) and time(s) the course meets
  - (once you read in the data, you'll add a couple of rows)
- this is what a portion of the DataFrame may look like (note, the courses are from a past semester, though)
- once you've read in your data, break apart the Number-Section column into two separate columns: Number and Section
  - Number is something like CSCI-UA.0480
  - Section is something like 001
  - try to use regular expressions with groups to do this; the str accessor method to use is extract
  - check out the end of the regex slides (../slides/python/regex.html) or the official pandas docs (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.extract.html)
- show:
  - info to show the data types and counts
  - the first 5 rows
  - the last 5 rows
  - a random sampling of 5 rows

General Workflow and Hints

- read in the html file
- parse the html
- pick out / extract the data
- it's very useful to use your browser's web inspector tools (right-click on an element and inspect to see html and parent, sibling, and child elements)
- ⚠ see this guide on using web inspector tools for chrome
(https://developers.google.com/web/tools/chrome-devtools/dom/)
- note that in the parsing library we cover in the slides …
- select, select_one, find, etc. can be called on an element to find elements nested within it
  - for example, if my_div is the result of calling select_one
  - … and my_div is <div><p>one</p><p>two</p></div>
  - … you can select elements within my_div with my_div.select('p')
- if you just want the first nested element
  - you can dot ( . ) the parent element
  - … and use the name of the nested element next
  - for example, if my_h1 is <h1><a>foo</a> bar</h1>, then my_h1.a can be used to access the nested a
- if a collection of elements is returned, you can index into it with [] (essentially, a list)
- getText(), .text or .get_text() returns all of the text within an element (even nested elements!)
- pay attention to patterns in data (what makes a course number, what element is a course number usually in?)
- it's also helpful to print out elements themselves (without using .text) to see what element was actually selected
- some example markup and parsing code:

  <section class="container">
    <div class="row">
      <p>Course Name: <a href="#">CS-123</a></p>
      <p>Alice Ahn</p>
    </div>
    <div class="row">
      <p>Course Name: <a href="#">CS-456</a></p>
      <p>Bob Bernstein</p>
    </div>
  </section>

  # get every element in the section element with class container
  # that has the class attribute, row
  # (soup is the parsed document)
  rows = soup.select('section.container .row')
  for row in rows:
      # looping over this selection gives us each div element
      # within each div element, find the paragraphs
      paragraphs = row.select('p')
      # show ALL text in first paragraph in current div
      print(paragraphs[0].text)    # on 1st iteration: Course Name: CS-123
      # dotting an element with a tag name retrieves the
      # first nested element with that tag name
      print(paragraphs[0].a.text)  # on 1st iteration: CS-123
      print(paragraphs[1].text)    # on 1st iteration: Alice Ahn

2. Read the course catalog into a DataFrame.

- the frame should have the following columns:
  - Number: the course number
  - Prereqs: a text description of the prerequisites
  - Points: the number of credits
- here's an example of a DataFrame
with some course catalog rows (again, the data is from another semester)
- show:
  - info to show the data types and counts
  - the first 5 rows
  - the last 5 rows
  - a random sampling of 5 rows
- use a similar parsing strategy as above to read in this DataFrame.

3. Put together both DataFrames

- create a new DataFrame by …
- finding a way to show all scheduled classes in the semester along with their points and prereqs
- only show the following columns, in this order:
  - Number: course number
  - Name: course name
  - Instructor: professor's name
  - Time: meeting time
  - Prereqs: course prerequisites
  - Points: number of credits
- hints:
  - use pd.merge to do this (../slides/python/pandas-join-combine.html)
  - how=left will keep all rows in the first DataFrame

4. Conclusion

- did you spot any anomalies, discrepancies, or unexpected data or relationships between data?
- if so, in a markdown cell, describe any problem(s) you saw
- additionally, describe how you might fix them (or if you already fixed them!)
- lastly, based on the resulting DataFrame, describe the behavior of how=left on these particular DataFrames
- if you need to see all rows, use pd.set_option('display.max_rows', 200)

Part 2 - Using an API

Overview

In this part of the assignment, you'll request data from a server in json format, parse it, and load it into a DataFrame. Using this DataFrame, you'll use aggregations to produce a report. The data set is composed of films from the Japanese animation film studio, Studio Ghibli (https://en.wikipedia.org/wiki/Studio_Ghibli). It is being served from a mirror of the data on linserv1.cims.nyu.edu. Note, however, that the original data is from https://ghibliapi.herokuapp.com/, which is under an MIT License (https://github.com/janaipakos/ghibliapi/blob/master/LICENSE). This is mirrored so that we do not overwhelm the original data source with requests.
Instructions

The goal of the assignment is to create a report showing directors' names, the number of Ghibli films that each director was involved in, and the average rottentomatoes score of the Studio Ghibli films made by that director. The expected output is shown below:

1. Retrieve the data, and examine it.
- In films.ipynb, programmatically retrieve one page of json from this URL: http://linserv1.cims.nyu.edu:10000/films?_page=1
- You can use requests to do this
  - you can use the json module to manually parse the response content
  - or … use a feature of the requests module that allows immediate parsing of a json response by calling the json() method
    - r = requests.get('some.url')
    - d = r.json() # parses json into dictionary!
- Examine the keys and values of the dictionary
- In a markdown cell, write out which keys you may be interested in to create the report specified above
- Try incrementing the last number in the url where page is 1 … do you get different results?
- In a markdown cell, describe what happens when you modify the url

2. Load the data into a DataFrame
   1. Make a request to http://linserv1.cims.nyu.edu:10000/films?_page=1 again, but this time, load the result into a DataFrame.
   2. Continue collecting additional data and adding to the DataFrame until there is no more data to retrieve

3. Report

Create a report that shows:
- the directors' names as the index (note that the index.name can be set to get what appears to be a title for the index (../slides/python/pandas-basics.html#62))
- the average rottentomatoes score (review aggregator website)
- the number of films directed
- concat and groupby may be helpful
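The paging and aggregation steps above can be sketched as follows. This is a minimal sketch, not a reference solution: the field names ("director", "rt_score", "title") follow the Ghibli API's documented schema, but you should verify them against the actual response you get from the mirror.

```python
import pandas as pd

def fetch_all_films(base_url="http://linserv1.cims.nyu.edu:10000/films"):
    """Request successive pages until an empty list comes back."""
    import requests  # network access needed only for this function
    frames, page = [], 1
    while True:
        data = requests.get(base_url, params={"_page": page}).json()
        if not data:               # an empty list means no more pages
            break
        frames.append(pd.DataFrame(data))
        page += 1
    return pd.concat(frames, ignore_index=True)

def director_report(films):
    """Average rt_score and film count per director."""
    films = films.assign(rt_score=pd.to_numeric(films["rt_score"]))
    report = films.groupby("director").agg(
        average_rt_score=("rt_score", "mean"),
        films_directed=("title", "count"),
    )
    report.index.name = "Director"  # gives the index a visible title
    return report

# In the notebook you would run, e.g.:
# films = fetch_all_films()
# director_report(films)
```

The pagination loop stops on an empty JSON array, which matches the "continue until there is no more data" instruction; other APIs may signal the last page differently.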
Main Examination Period 2021 - January - Semester A - Timed Examination

Module Code and Title: BUS096 Relationship and Network Marketing
Date of exam: 7th January 2021

Question 1
● How does the dark side of close relationships influence B2B relationships? [25 marks]
● Use a tension-based view to explain the dark side of business relationships and explain how tensions affect B2B relationships negatively. [25 marks]

Question 2
● How does Social Exchange Theory (SET) explain the mechanisms underlying business relationships? [30 marks]
● What are the weaknesses of SET? Provide one example for each. [20 marks]

Question 3
● How does Transaction Cost Economics (TCE) explain the mechanisms underlying business relationships? [30 marks]
● What are the weaknesses of TCE? Provide one example for each. [20 marks]

Question 4
● What firm behaviours in business relationships does the Resource Dependence Theory (RDT) suggest, and why? [30 marks]
● What are the weaknesses of RDT? Provide one example for each. [20 marks]
MMM147 Measuring Marketing Performance

1. What is the Purpose of This Assessment?

The following table shows which of the module learning outcomes are being assessed in this assignment. Use this table to help you see the connection between this assessment and your learning on the module.

Module Learning Outcomes Being Assessed

Assessed learning outcomes relate to point 9 in the Module Description. The particular assessable learning outcomes are:

1. Distinguish between assets, liabilities, equity, income and expenses.
2. Understand the layout of the three financial statements and differentiate between these: income statement (profit and loss), statement of financial position (balance sheet) and cashflow statement, and the importance of Current Year / Prior Year (CY/PY) analysis.
3. Calculate a simple income statement.
4. Understand the drivers and consumers of cash.
5. Develop the ability to undertake preliminary research prior to analysing company accounts.
6. Distinguish between, and calculate, a vertical and horizontal analysis of financial information and interpret the data.
7. Prepare simple financial ratios and use them to interpret company financial information.
8. Interpret financial statements and understand how to use them to make marketing investment decisions.
9. Understand how income statements can be used across products, brands, business segments and consolidated companies.

Additional learning outcomes:
- To gain confidence as non-financial specialists in using financial statements for making strategic and tactical business decisions.
- To gain further skills in fact-based decision making; presentation, summary and visualisation of data; uncertainty of future outlooks and associated risk management.

2. What is the Task for This Assessment?

To produce a report on the financial performance and position of a real company by analysing its financial statements.

3. What is Required of Me in this Assessment?
Task (see detailed 'What to include in the report' below)

The task is to demonstrate your skills in interpreting the 3 financial statements of an existing public company* operating anywhere in the world** and using the data to draw conclusions on past actions, future strategy, and tactics, including possible scenarios and associated risks.

*To choose an organisation, it is advisable to think of a sector/industry you are most passionate about first and only then select a company
**All company data relevant to the assignment must be in English

What to include in the report?

1. Give background to the organisation and their strategy (LO5) [15%*]
   - Describe the organisation and the industry/sector it operates within.
   - Give an overview of products/services made/delivered and its key competitors.
   - Explain key challenges (e.g., operating constraints) and strategic objectives of the organisation by reviewing executive comment and commentary from other external sources (e.g., Financial Times).

2. Evaluate the company performance (LO1, LO2, LO3, LO7, LO9) [40%*]
   Evaluate how well the company is performing by analysing its annual report. You should:
   - Analyse the profit and loss and balance sheet using current year/prior year trend analysis.
   - Calculate and comment on financial ratios, using the ratios where appropriate to evaluate different segments of the business, e.g., across brands, sub-businesses, countries (where appropriate). [Calculations along with formulas to be in an appendix]
   - Consider performance against a "benchmark" organisation (ideally in the same sector) or an industry average.

3. Calculate vertical and horizontal analysis of the financial statements and interpret the data (LO1, LO2, LO3, LO6) [15%*]
   With particular focus on the Income Statement, calculate the horizontal and vertical analysis and present a summary of the key findings. Comment on comparison across Current and Prior Year(s). [Calculations to be in an appendix]

4.
Comment on the drivers and consumers of cash (LO4) [10%*]
   Starting with cash from operations, identify the cash generating and consuming areas from the cash flow statement and discuss trends when comparing to the previous year.

5. Construct a forecast income statement for the chosen organisation (LO2, LO3, LO4) [15%*]
   Build a forecast income statement and explain your assumptions in revenue and expense build-ups and gross profit expectations. Compare and contrast the key financial ratios for your forecast compared to current performance.

6. Discuss key challenges for the company and strategic marketing options (LO8) [5%*]
   Use your forecast income statement and the findings of your strategic analysis to assess the key financial challenges in delivery of profit going forward. Outline how you would propose to allocate the marketing expense across the different business segments.

* Word count in % – this is only an approximate word count
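For the vertical and horizontal analysis element, the arithmetic can be illustrated with a tiny worked example. This sketch is purely illustrative and not part of the brief; all figures are invented, and in the report the calculations would sit in an appendix using your chosen company's actual statements.

```python
# Toy income statement figures (invented): prior year (PY) and current year (CY)
revenue_py, revenue_cy = 1000.0, 1150.0
cogs_py, cogs_cy = 600.0, 700.0

# Vertical analysis: express each line item as a % of revenue in the SAME year
cogs_vertical_cy = 100 * cogs_cy / revenue_cy  # cost of sales as % of CY revenue

# Horizontal analysis: % change of each line item from PY to CY
revenue_horizontal = 100 * (revenue_cy - revenue_py) / revenue_py
cogs_horizontal = 100 * (cogs_cy - cogs_py) / cogs_py

print(f"COGS as % of CY revenue: {cogs_vertical_cy:.1f}%")   # ~60.9%
print(f"Revenue growth CY vs PY: {revenue_horizontal:+.1f}%")  # +15.0%
print(f"COGS growth CY vs PY:    {cogs_horizontal:+.1f}%")     # ~+16.7%
```

Here costs grew faster than revenue, so the vertical analysis would show the cost ratio creeping up year on year; that is exactly the kind of finding the report should interpret.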
NSP655 - Lab Assignment 3
Configuring DHCP and DNS servers in Linux

In this lab, we will examine how to configure Dynamic Host Configuration Protocol (DHCP) and Domain Name System (DNS) server services under Linux.

PART 0: Configuring vNICs on the guest VMs for host-only mode

§ On the Linux server VM, add an additional virtual NIC by selecting VM – Settings from the VMware WS main menu. Click on Add at the bottom of the window, select Network Adapter and click Finish. On the right-hand side, select the Host-only mode for this network connection and click OK. This will configure the vNIC using a separate Host-only virtual network (VMnet1) in VMware on the host OS.

§ In older versions of VMware WS, the newly added adapter (ens224) may have a separate IP address given by virtual DHCP. To shut down the virtual DHCP server (since we will set up our own DHCP service on the Linux server), select from the VMware WS main menu Edit – Virtual Network Editor. Click Change Settings on the bottom and click Yes to allow VMware to make changes (NOTE: at this point, the Virtual Network Editor window may disappear from the foreground, just minimize the VMware WS application to reveal the window). Next, select VMnet1 (Host-only) from the top pane, and uncheck the box at the bottom of the window (Use local DHCP service to distribute IP address to VMs). Note that, if this box is already unchecked (VMware WS 17 default), then there is no need to make any changes. Click OK to save the settings and then bring your Fedora 38 VM back to the foreground again, if necessary.

§ Open a terminal and type the command ifconfig. You should see a new network interface (e.g. ens224) that is connected to the Host-only virtual network. This is a virtual interface in the guest VM; we will assign a static IP to this interface in the next part.
§ Similarly, add an additional host-only vNIC for the Linux client system and the Windows 10 system (for which you will see a new Intel 82574L NIC appear as Ethernet1).

PART 1: Configuring the Dynamic Host Configuration Protocol server service (DHCPD)

Follow the procedure outlined below to set up and test a DHCP server. You will need to configure both the Linux server VM and the Windows 10 VM to complete this part.

§ First, we will assign an IP address in the 192.168.100.0/24 network to the newly added vNIC on the Linux server VM. To do this, click the top right of the desktop (network/volume/power icon) and click on the arrow next to Wired (blue colour). You will notice that Ethernet (ens224) is not connected. Click on Wired Settings below this and click the small gear icon on the right edge of the Ethernet (ens224) connection/speed field to configure the settings. From the selection area at the top, click IPv4, select the Manual IPv4 Method, specify the Address as 192.168.100.10 with Netmask 255.255.255.0 (leave the Gateway blank), specify the DNS address as 127.0.0.1 (disable the Automatic DNS mode) and click Apply at the top of the window. Use the toggle control to activate Ethernet (ens224). Close the Network Settings window and check to see if the virtual interface now has been assigned an IP address, using the ifconfig command. If not, click the top right of the desktop (network/volume/power icon) again, click on the arrow next to Wired (blue colour) and click Ethernet (ens224) to connect.

§ Once the VM is restored and you log into Windows, a new Intel 82574L Gigabit Network Connection should be detected automatically. This will be the Ethernet1 connection (verify this by navigating to File Explorer – right-click on Network then select Properties – select Change adapter settings on the left). This network interface does not have an IP address yet but will obtain one automatically from the Linux server VM once we have configured a DHCP server.
§ Back at the Linux server VM, since the DHCP server package is not installed on the system (to see this, type rpm -qa | grep dhcp at the command prompt), we will start by installing this package. Open a terminal window, switch user to root and issue the command dnf install dhcp-server. Type y and press enter when prompted to download the package.

§ To set up the DHCP server we must first configure the file /etc/dhcp/dhcpd.conf and then start the service. Open a terminal window, switch user to root, and back up the existing configuration file by typing cp /etc/dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf.bak. Now copy the sample dhcpd configuration file to /etc/dhcp by typing cp /usr/share/doc/dhcp-server/dhcpd.conf.example /etc/dhcp/dhcpd.conf. Answer yes if the system asks you to overwrite the existing file (we already have a backup). MAKE SURE YOU TYPE THIS COMMAND CAREFULLY!

§ Using the gedit editor, edit the /etc/dhcp/dhcpd.conf file and make the following changes:

1. At the top of the file, set the domain-name global DHCP option to nspdomN.local (do not remove the quotes; N is your unique number), and comment out the domain-name-servers global DHCP option (by inserting a # character in front of the option).
2. Near the top of the file, uncomment the line containing the word authoritative.
3. Delete the first "subnet" section (2 lines); edit the second "subnet" section and specify the network subnet as subnet 192.168.100.0 netmask 255.255.255.0 { and the DHCP address range as range 192.168.100.50 192.168.100.100; which is consistent with the subnet above.
4. In the subnet section above, comment out the routers local DHCP option (we will set this later for the final project).
5. Delete all remaining "subnet" and "host" sections, leaving only the section for host fantasia { … }. For this section, comment out all the corresponding statements, including the brackets ({}) and the word fantasia. Also, delete any lines listed after this entry.
6. Save the file and exit.
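As a sanity check, the active (uncommented) lines of the edited file might look roughly like the sketch below. This is only an illustration of the end state after the numbered steps, not a file to paste in; your copy of the sample file will contain additional commented-out sections, and N is your unique number.

```conf
# /etc/dhcp/dhcpd.conf -- sketch of the active lines after editing
option domain-name "nspdomN.local";
# option domain-name-servers ns1.example.org, ns2.example.org;   (commented out)

authoritative;

subnet 192.168.100.0 netmask 255.255.255.0 {
  range 192.168.100.50 192.168.100.100;
  # option routers ...;   (commented out; set later for the final project)
}
```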
§ From a terminal window as user root, start the DHCP server service by typing systemctl start dhcpd.service. If you don't see any error messages, the service should now be running (verify this using the same command with the status option); if there is an error message, check the dhcpd.conf file for syntax errors like missing semicolons or end brackets. You can also check the dhcpd status log for any useful hints as to what went wrong.

§ To test this out, disable and then re-enable the Ethernet1 interface in the Windows 10 VM, to force it to obtain an address from the DHCP server running on the Linux server VM. Verify the IP address obtained by opening a command window and typing ipconfig. You can also verify the DHCP server operation by checking the dhcpd status log (look at the last few lines of output).

WHAT TO SHOW/SUBMIT:
- Show (capture) terminal window on Linux server with output from ifconfig command, after new vNIC is added and IP address is configured.
- Show (capture) gedit window showing contents of dhcpd.conf configuration file.
- Show (capture) command window on Windows 10 VM showing output of ipconfig command with IP address and domain name obtained automatically from Linux server.
3 captures total.

PART 2: Configuring the Berkeley Internet Name Domain (BIND) DNS server service (NAMED)

Boot your system into Fedora Linux and follow the procedure outlined below to configure the BIND DNS server service. You will need to use the two Fedora VMs to complete this part.

§ Before we start, we will prepare the Linux client VM, i.e., the cloned F38 VM from lab 1. Log into the client VM and complete the first step from part 1 above to assign an IP address of 192.168.100.20 with mask 255.255.255.0, and a DNS server address of 192.168.100.10, i.e., the address of the Linux server VM that will be running the DNS server (do not forget to disable the Automatic DNS mode).
Finally, disable the primary (ens160) interface; that is, click the top right of the desktop (network/volume/power icon), click the arrow next to Wired and click Ethernet (ens160). (We do this to avoid having the College/ISP DNS server appear in the system configuration. DO NOT FORGET TO DISABLE THIS!)
§ Boot into the Linux server VM that you used to configure DHCP in part 1 above. Open a terminal and switch user to root. Issue the command dnf install bind to install the BIND server package as well as other supporting packages. When prompted, type y and press enter to complete the installation.
§ To set up the DNS service, we first need to make a few changes to the BIND configuration file. To edit the file, issue (as root) the command gedit /etc/named.conf. Near the top of the file, make the following changes under the options section:
1. In the first line, the listen-on port option, append 192.168.100.10; (including the semicolon) after 127.0.0.1 to allow the server to listen for requests on the virtual network interface in addition to the loopback address.
2. In the line that contains the allow-query option, append 192.168.100.0/24; (including the semicolon) after localhost to allow the server to accept queries from the entire virtual network subnet in addition to local queries.
Near the bottom of the file, after the zone "." IN declaration, add the following sections to define a FORWARD zone called nspdomN.local and a REVERSE zone for the 192.168.100.0 address space:
zone "nspdomN.local" IN {
    type master;
    file "nspdomN.local.db";
    notify NO;
};
zone "100.168.192.in-addr.arpa" IN {
    type master;
    file "192.168.100.db";
    notify NO;
};
Make sure not to make mistakes (e.g., N should be your unique number, not the letter) and, when done, use the command named-checkconf /etc/named.conf to check the file for errors. If there is no output, the file is OK; otherwise fix the indicated syntax errors and try again.
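For reference, after steps 1 and 2 above the relevant lines of the options section would look roughly like the following (a sketch based on the stock Fedora named.conf; the other default options in the section are left unchanged and are elided here):

```
options {
        listen-on port 53 { 127.0.0.1; 192.168.100.10; };
        ...
        allow-query     { localhost; 192.168.100.0/24; };
        ...
};
```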
§ To add IPv4 address (A) records and reverse pointer (PTR) records to the newly created zones, we need to create the zone files nspdomN.local.db and 192.168.100.db specified in named.conf above. To do that, first switch to the /var/named directory as root. Then create the first file using the command gedit nspdomN.local.db (N is your unique number) with the following contents:
;zone "nspdomN.local"
;
$TTL 1H
;
@ IN SOA localhost. root.localhost. (
        1  ;serial
        3H ;refresh
        1H ;retry
        1W ;expire
        1H );caching TTL
@ IN NS localhost.
;
F38server IN A 192.168.100.10
F38client IN A 192.168.100.20
Use the command named-checkzone nspdomN.local /var/named/nspdomN.local.db (N is your unique number) to check the file for syntax errors. If an error is reported, correct it and try again until the command returns OK. To create the second file, use gedit 192.168.100.db and add the following contents:
;zone "100.168.192.in-addr.arpa"
;
$TTL 1H
;
@ IN SOA localhost. root.localhost. (
        1  ;serial
        3H ;refresh
        1H ;retry
        1W ;expire
        1H );caching TTL
@ IN NS localhost.
;
10 IN PTR F38server.nspdomN.local.
20 IN PTR F38client.nspdomN.local.
To avoid excessive typing, you can use copy and paste, or copy the first file to create the second one and change the required lines. Use a similar command as above to check for syntax errors and correct as necessary. Now start the DNS service by issuing the command systemctl start named.service as user root. Note that you will have to restart the service every time you make a change to the DNS configuration.
§ Before testing the DNS service, we need to make a small change to the configuration file of the systemd-resolved service that Fedora uses to provide network name resolution to local applications. On the Linux server VM as root, edit the file /etc/systemd/resolved.conf and add the lines DNS=127.0.0.1 and Domains=nspdomN.local under the [Resolve] section (N is your unique number).
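For reference, the edited [Resolve] section of /etc/systemd/resolved.conf on the server VM would then contain (N is your unique number; the other commented-out defaults in the file stay as they are):

```
[Resolve]
DNS=127.0.0.1
Domains=nspdomN.local
```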
HINT: these lines already exist, so you just need to uncomment them (remove # at the front) and change them to add the address and domain specified. Restart the service using the command systemctl restart systemd-resolved and enter the command resolvectl status to verify that the changes appear under the Global section. Do the same procedure on the Linux client VM but use DNS=192.168.100.10 instead. § To test DNS, at the command prompt on the Linux client VM, type nslookup F38server. The DNS server (F38server) should return the IP address 192.168.100.10. Now try a reverse resolution lookup by typing nslookup 192.168.100.10. You should receive the hostname F38server.nspdomN.local in response. § Finally, we will edit the zone files to add an entry for the Windows 10 VM (stationN-Win10) using the IP address obtained by the DHCP server in part 1. Edit the zone files nspdomN.local.db and 192.168.100.db to add the appropriate lines at the end of each file for the A record and PTR record that is required (N is your unique number). Also, DO NOT forget to increase the serial number from 1 to 2 as this is a new change (i.e., change 1 ;serial to 2 ;serial near the top of each file). As before, verify that the files are syntactically correct and restart the named service using systemctl restart named to make sure the changes take effect. You should now be able to get a valid answer using the command nslookup stationN-Win10 on the Linux client. You can also try the command ping stationN-Win10 (i.e., try to ping the Windows VM virtual network interface; note that you may have to turn off the Windows firewall to be able to do so) to verify that you can ping another system using the domain name instead of the IP address. § The BIND DNS server will not start automatically when the system or VM boots unless you issue the command (as root) systemctl enable named. Carry this out to make sure the DNS service is always running when the system starts up. 
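As an illustration of the zone-file additions described above, assuming the Windows 10 VM was leased the first address in the DHCP pool, 192.168.100.50 (check ipconfig on the Windows VM for the actual address, and substitute your unique number for N), the new records would look like this:

```
; appended to nspdomN.local.db (and bump the serial from 1 to 2)
stationN-Win10 IN A 192.168.100.50

; appended to 192.168.100.db (and bump the serial from 1 to 2)
50 IN PTR stationN-Win10.nspdomN.local.
```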
WHAT TO SHOW/SUBMIT:
1. Show (capture) terminal window on Linux client with output from ifconfig command, after the new vNIC is added and the IP address is configured.
2. Show (capture) contents of nspdomN.local.db forward zone file, showing all 3 A record entries.
3. Show (capture) contents of 192.168.100.db reverse zone file, showing all 3 PTR records.
4. Show (capture) contents of named.conf configuration file, showing changes.
5. Show (capture) terminal window on Linux client showing output of multiple nslookup commands (forward and reverse resolution) and successful pinging of the Windows 10 VM.
5 captures total.
CSE 101 Final Review Problems
1. Determine whether the following statements are True or False. No justification is required.
a. n√n = Ω(n^2)
b. n^π = O(n^3)
c. n^2 = Θ(9^(log_3(n)))
d. n√(3n) = ω(√n)
e. n^2 = o(n^3)
f. ln(n) = o(n)
g. 2^n = O(n^2)
h. n^1.5 = ω(n^1.45)
i. n·ln(n) = Θ(ln(ln(n)))
j. f(n) = ω(f(n)) for any function f(n)
2. Given a Binary Search Tree based on the following C++ struct
struct Node {
    int key;
    Node* left;
    Node* right;
};
Complete the recursive C++ function below called TreeWalk() that takes as input a Node pointer R and a string s, then returns a string consisting of all keys in the subtree rooted at R, separated by spaces. The order of the keys depends on the input string s, which will be either "pre", "in" or "post", indicating a pre-order, in-order or post-order tree walk, respectively. If the input s is not one of the strings "pre", "in" or "post", then your function will return the empty string. The recursion will terminate when R has the value nullptr.
std::string TreeWalk(Node* R, std::string s){
    // your code starts here

    // your code ends here
}
3. Perform Dijkstra(G, s) on the weighted digraph below with source vertex s = 5. If at some point two vertices have equal minimum d-values, extract the one with the smaller label first from the min Priority Queue.
a. Determine the order in which vertices are extracted from the min Priority Queue.
b. For each vertex x, determine the values d[x] and p[x].
4. Insert the keys 5, 9, 7, 2, 6, 4, 8, 3, 1, 10 (in order) into an initially empty Binary Search Tree T. (Note: use the Binary Search Tree Insert algorithm to do this.)
a. Give the keys in the order printed by a pre-order tree walk.
b. Give the keys in the order printed by a post-order tree walk.
Note: the three questions below do not refer in any way to the Red-Black Tree Insert algorithm. Instead they ask if it is possible to assign colors in the BST T, which you found above, so as to satisfy the RBT properties.
Be sure to include nil children when computing the black-height of T.
c. Is it possible to assign the colors {Red, Black} to the vertices of T so that the Red-Black Tree properties are satisfied, and bh(T) = 1? If it is possible, specify all such colorings by stating, for each coloring, the set of keys belonging to red nodes.
d. Is it possible to assign the colors {Red, Black} to the vertices of T so that the Red-Black Tree properties are satisfied, and bh(T) = 2? If it is possible, specify all such colorings by stating, for each coloring, the set of keys belonging to red nodes.
e. Is it possible to assign the colors {Red, Black} to the vertices of T so that the Red-Black Tree properties are satisfied, and bh(T) = 3? If it is possible, specify all such colorings by stating, for each coloring, the set of keys belonging to red nodes.
5. Insert the keys 6, 2, 1, 4, 3, 5 into an initially empty Red-Black Tree (using the RB_Insert() algorithm), then draw the resulting tree T. Indicate the color of each node in T.
MATH5007 2024 Tri3a: Assignment 3 (Singapore)
Question 1: General (30 Marks); max 300 words
The unit provided you with knowledge and skill sets on optimisation (and later simulation) and introduced you to models implemented with AMPL.
1. Why is optimisation knowledge valuable for providing decision support even if you are not in charge of creating and solving optimisation models? Should managers have at least an introduction to optimisation? 8M
2. Explain IN YOUR WORDS the difference between prescriptive and predictive modelling to someone unfamiliar with the subject. 8M
3. Why is modelling a process with toy car models a possible option? Describe a scenario where you could see Lego being used (at least in an early planning stage). 8M
4. Consider removing a constraint from a model; what changes to the optimal objective function value do you expect in the case of a minimisation problem? Provide a short argument. 6M
Question 2: Network Flow (85 Marks)
Answer the following questions about network flow problems.
1. We used a BigM constraint in the MST problem. Explain the BigM constraint (5M) and why it was required to find the solution (5M). Include in your answer why solving the problem without the BigM constraint results in a wrong answer (5M).
2. Map the differences between the four variations discussed in the classroom (Transhipment, Routing, Max-Flow, MST). Consider the data as well as the model. 10M
3. How would you model, for the routing problem, a toll (that is, a fee paid if the connection is used) and a blocked connection? As there is no specific model given, provide enough detail to explain your answer. 10M
4. Solve the following problem using a network flow approach. (overall 50M)
A producer of outdoor BBQ sets has a production period from January to June, with the product being on the shelves at dedicated retail stores from March to August.
The capacity and demand (in 1000s) are shown in the table below. The manufacturing cost as well as the carrying cost varies over the considered time periods (cost per 1000).
a. Draw the network representation you are using. 10M
b. Write the data and model files solving the problem. 35M
c. What is the optimal solution? Do not limit the answer to the objective function value but visualise the decision variables as well. 5M
Use comments in your model/data files to explain the elements of the model.

Month      Capacity (1000s)  Demand (1000s)  Cost per 1000  Carrying Cost, First Month  Carrying Cost, Other Months
January          16                -              7100                 110                        55
February         18                -              7700                 110                        55
March            20               14              7600                 120                        55
April            28               20              7800                 135                        55
May              29               26              7900                 150                        55
June             36               33              7400                 155                        55
July              -               28                 -                   -                         -
August            -               10                 -                   -                         -

Question 3: Multi-Objective Optimisation (10 Marks)
The question has the following tasks:
1. Explain in YOUR words MOO in 100 words max. (10M)
2. Follow the instructions below to modify the given problem.
Given is the shopping_moo.mod model. The model describes a knapsack problem, where items with a value and a weight have to be packed into a given number of bags. The optimisation target is either the maximisation of value (zv) or the maximisation of the number of items in the bags (zi). The problem has two constraints:
1. A weight constraint ensures that the weight restrictions of the bags are kept.
2. The number of items of each product across all bags is restricted.
Running the model, you get either a value of 328 with 15 items (zv) or 307 with 19 items (zi). Use multi-objective optimisation to see if there is a solution that maximises both objectives. In the data file, there is already a declaration of targets and weights. Your task is to
1. Write the MOO objective function (10M)
2. Write the constraint restricting the deviation to Q (15M)
3. Find the weights (param weights) for the deviation constraint that result in a solution with 16 items. How do you approach a solution for the weights?
(15M)
Note: Follow the example we had in class. IMPORTANT: It is a maximisation and NOT a minimisation problem. What impact does this have on the deviation constraint? Include the results in the Word document and submit the updated shopping_moo.mod.
Total: 107 Marks (representing 10% of the final mark)
The final submission is via email attachments to the lecturer (Adrian). Use your Curtin email account. Attach to the submission a Word document with all your answers to the questions (you can use this file and insert your answers in a different colour) and the .mod and .dat files based on the questions. You can use a compressed folder or zip file. Use the following structure for the filenames: a3_YOURID_answers.docx for the answers to the questions, a3_YOURID_answers.zip in case you submit everything in a compressed file, and a3_YOURID_Q[number of question this file relates to].mod/.dat. Using given formats and naming conventions is crucial in organisations, so we deduct up to 5M if the filenames are not as specified.
SCHOOL OF COMPUTER SCIENCE
MASTER OF APPLIED COMPUTING (MAC)
ASSIGNMENT 2 (Weightage 15%)
SEPTEMBER 2024 SEMESTER (Block 2)
MODULE NAME : Principles of AI
MODULE CODE : ITS70304
Scenario and Task Description
In industries where competition is high and profit margins are low, satisfied and loyal customers can distinguish organizations from their competitors, providing a competitive edge and the potential for greater profits. Research frequently demonstrates that organizations benefit from satisfied and loyal customers, but the factors that contribute to satisfaction and loyalty are not always clear in ways that can be translated into action for practitioners. Prior airline industry studies have revealed that customer satisfaction leads to higher profits and encourages loyalty behaviors. For example, satisfied airline customers are more likely to recommend an airline and repurchase tickets (Kim and Lee, 2011), which contributes to an airline's profitability and increased market share (Buttle, 1996; Dagger et al., 2007; Devlin and Dong, 1994). Additionally, loyal consumers are more willing to forgive a service failure and are more resilient to rising prices (Mattila, 2001). The global aviation analytics market is projected to show an increasing trend through 2027. Data analytics for an airline passenger satisfaction study typically involves gathering, processing, analyzing, and interpreting data related to the experience of passengers across different dimensions. This process helps airlines understand key drivers of satisfaction, identify areas for improvement, and optimize services.
Practical Skills
Perform exploratory data analysis and build a predictive model that answers the question: “Is the passenger satisfied or not satisfied with the airline services?” based on the factors identified in the airlinesatisfaction.csv dataset. Write a Python program to answer the following.
However, before the prediction can be made, this dataset needs to be pre-processed before it can be fed into the AI prediction model. Pre-process the airlinesatisfaction.csv dataset with Python programming on Google Colab. Each question below requires your code.
1. A.I. systems are trained on patterns in certain examples, and because all possible examples cannot be covered, the systems are easily confused when presented with a new scenario. Airline schedules change every day due to weather conditions, technical difficulties and many other reasons. How can AI handle such situations? (1 mark)
2. Load the dataset into a Pandas DataFrame and list the libraries you may need to use. Find the following information: (3 marks)
a. Number of rows and columns
b. The basic statistics of all columns, and the basic information of the columns, including the data type of each column
3. Identify how many attributes contain missing values. Find the number of missing values in each attribute and handle them; you may use imputation. Explain your method. (2 marks)
4. Identify the 2 main variables that show the strongest relationship with the target variable in detecting passenger satisfaction. Plot a heatmap to explore the relationship between them. (3 marks)
5. The target variable (satisfaction) has 3 values (satisfied, neutral, dissatisfied). Change the values into a binary classification. Show your code and count how many observations fall into each of the new values. (3 marks)
6. Create a single train/test split of the data: set aside 80% for training and 20% for testing. Create a neural network model and fit it to your training data. Measure the accuracy of the resulting model using your test data. (3 marks)
To demonstrate broad and coherent theoretical and technical knowledge, add comments where necessary throughout the program. Please make sure you copy and paste the respective code into your pdf file and explain each part.
Marking Rubrics (lecturer’s use only)
Attach as second page in the report.
The purpose of this learning assignment is based on the following module learning outcome (MLO):
MLO2 — Demonstrate knowledge of Data Privacy and Ethical Considerations.
Type of activity: Practical
Practical Skills:
- Outstanding (80 – 100): Demonstrates comprehensive exploration and analysis of AI applications, in a highly logical and extensive manner, and is able to pre-process the dataset for the AI application in the airline satisfaction prediction modelling. The Python program/code is applied correctly and the solution is clearly elaborated and presented in a step-by-step manner. The similarity is less than 2%.
- Mastering (65 – 79): Demonstrates enough interpretation/evaluation to develop a coherent exploration and analysis of AI applications and is able to pre-process the dataset for the AI application in the airline satisfaction prediction modelling. The Python program/code is applied correctly but the solution is NOT clearly elaborated and presented in a step-by-step manner. The similarity is between 2% and 4%.
- Developing (0 – 64): Demonstrates some interpretation/evaluation of AI applications but is unable to pre-process the dataset for the AI application in the airline satisfaction prediction modelling. The Python program/code is applied incorrectly and the solution is NOT clearly elaborated and presented in a step-by-step manner. The similarity is greater than or equal to 5%.
Q1 _____/1  Q2 _____/3  Q3 _____/2  Q4 _____/3  Q5 _____/3  Q6 _____/3
Submission Requirements
1. Font type : Times New Roman
2. Font size : 12
3. Line spacing : 1.5
4. Alignment : Justify Text
5. Document type : .pdf, .ipynb
6. Number of pages : 5 – 12 pages (do not exceed the page limit)
7.
Your full report should consist of the following:
a) Cover page (Name, ID, Date, Signature, Score)
b) Marking Rubrics & Declaration (attach as second page in the report)
c) Report of your answer script
d) Appendices (line spacing = 1.0)
· List of references (APA format)
· Python script
· Report of similarity score (the percentage of the similarity score from each source needs to be shown)
8. Start each question on a separate page.
9. All figures and tables are labelled properly.
10. File naming convention: StudentName_Assignment2
SCHOOL OF COMPUTER SCIENCE
MASTER OF APPLIED COMPUTING (MAC)
ASSIGNMENT 3 (Weightage 30%)
SEPTEMBER 2024 SEMESTER (Block 2)
Marking Rubrics (lecturer’s use only)
Attach as second page in the report.
The purpose of this learning assignment is based on the following module learning outcomes (MLO):
MLO3 - Propose and select a suitable AI or machine learning algorithm for a given application.
MLO4 - Analyze an AI-based solution for a given application.
Type of activity: Question
Part I:
- Outstanding (80 – 100): Accurately describes in detail how regex works and shows clear understanding of it. Demonstrates comprehensive steps to build the two models with correct justification. The similarity is less than 2%.
- Mastering (65 – 79): Correctly lists and explains 5 examples of regex functions with code and sample output but no comprehensive explanation. Builds the two models adequately with adequate justification. The similarity is between 2% and 4%.
- Developing (0 – 64): Does not accurately describe in detail how regex works or show clear understanding of it. Did not demonstrate comprehensive steps to build the two models.
Part II:
- Outstanding (80 – 100): Describes those TWO (2) matrices correctly. The similarity is less than 2%.
- Mastering (65 – 79): Describes those TWO (2) matrices adequately. The similarity is between 2% and 4%.
- Developing (0 – 64): Describes those TWO (2) matrices incorrectly. The similarity is greater than or equal to 5%.
Q4 _____/4
- Outstanding (80 – 100): The accuracy, precision and recall values are calculated correctly and the comparison between the two models’ performance is explained precisely. The similarity is less than 2%.
- Mastering (65 – 79): The accuracy, precision and recall values are calculated correctly but the comparison between the two models’ performance is explained only adequately. The similarity is between 2% and 4%.
- Developing (0 – 64): The accuracy, precision and recall values are calculated incorrectly and the comparison between the two models’ performance is explained wrongly. The similarity is greater than or equal to 5%.
Q6 _____/3
Submission Requirements
1. Font type : Times New Roman
2. Font size : 12
3. Line spacing : 1.5
4. Alignment : Justify Text
5. Document type : .pdf, .ipynb
6. Number of pages : 5 – 12 pages (do not exceed the page limit)
7. Your full report should consist of the following:
a) Cover page (Name, ID, Date, Signature, Score)
b) Marking Rubrics & Declaration (attach as second page in the report)
c) Report of your answer script
d) Appendices (line spacing = 1.0)
· List of references (APA format)
· Python script
· Report of similarity score (the percentage of the similarity score from each source needs to be shown)
8. Start each question on a separate page.
9. All figures and tables are labelled properly.
10. File naming convention: StudentName_Assignment3
BUSI2111-E1 ACCOUNTING INFORMATION SYSTEMS (Sample Exam Paper)
SECTION A
Attempt ALL questions. Each correct answer is worth 3 marks - Total 60 marks. For each question, there is only one correct answer. Mark the appropriate boxes on the multiple-choice answer card provided.
1. For today’s accounting graduates, which one of the following skills is the LEAST expected among other options by employers like multinational enterprises? (c)
a. Data Analytics
b. Creative and critical thinking
c. Computer Programming
d. Database Technology
2. Which one of the following is WRONG about AIS? (d)
a. The accounting function has more and more relied on information systems
b. Information systems were almost a subset of accounting in the 1970s
c. Even in the 21st century, not all accounting information systems apply information technologies
d. AIS is the overlap part between information systems and Accounting nowadays
3. One main reason for big data is: (a)
a. The rapid growth of social media and smart phones
b. The initiatives for green technology
c. Traditional databases can be used to store big data
d. Many types of software are more affordable for business
4. Big data includes: (b)
a. Data from an organization’s relational database
b. Data from real-time environmental systems
c. Data generated from green technologies
d. Data related to leased software services
5. Which one of the following is NOT an AIS function? (b)
a. Processing data and turning it into useful information
b. Allocating an organisation’s resources
c. Collecting and storing data about an organisation’s resources
d. Providing proper controls to safeguard an organisation’s resources
6. Which one of the following is NOT a business process supported by ERP systems? (a)
a. Customers
b. Human Resources
c. General ledger and financial reporting
d. Revenue
7. Which one of the following is NOT a benefit from XBRL? (b)
a. Reduced data manipulation
b. Reduced auditing needs
c. Data is interchangeable
d. Paperless reporting
8.
Which of the following is a FALSE statement about the future trend and expectation of audit? (c)
a. The public expect that future auditing can prevent corporate failure
b. Auditors will analyse a much larger data set
c. Auditing will eventually be conducted by automation technologies
d. Future business models based on more advanced technology will create complex audit challenges
9. Based on the COSO Internal Control Model, the control environment includes: (c)
a. Obtaining or generating relevant information to support internal control
b. Considering the possibility of fraud
c. Commitment to integrity and ethics
d. Evaluating the components of internal control
10. Internal control’s main objectives do NOT include: (d)
a. Complying with management policies
b. Providing accurate and reliable information
c. Improving efficiency
d. Assuring that no fraud exists
11. Which one of the following is NOT a benefit of database technology: (c)
a. Minimal data redundancy
b. Data integration
c. Decentralized management of data
d. Data sharing
12. Assume that when students enrol at UNNC, their name, ID number, address and phone number are recorded by the admissions department. The same information must be completed on their applications for accommodation and on their paper registration materials. These various forms are used to record information in different files maintained by each separate office. What advantage of database technology would improve the data storage and data quality? (c)
a. Databases involve separate storage of each division’s data in separate files
b. Databases integrate parallel activities
c. Databases capture data once and store it once
d. Users enter their own data when databases are used
13. Which one of the following is a FALSE statement about the difference between the traditional journal/ledger-based accounting system and the database/REA model-based accounting system? (c)
a. Databases store more comprehensive data than the traditional accounting system.
b.
Real-time reports can be generated by database technologies.
c. The traditional approach can generate more accurate information.
d. The database approach enhances decision making.
14. Which statement below is FALSE regarding the basic requirements of relational databases? (d)
a. All non-key attributes in a table should describe a characteristic of the object identified by the primary key.
b. Every column in a row must be single-valued.
c. Foreign keys, if not null, must have values that correspond to the value of a primary key in another table.
d. Primary keys can be null.
15. If non-primary key items are stored several times in different locations (for example, salesperson name and address are manually entered and stored in multiple tables such as the employee table, sales table, payroll table etc.), which anomaly can this cause? (a)
a. update anomaly
b. delete anomaly
c. insert anomaly
d. no anomaly
16. Which one of the following is NOT a possible type of cardinality between entities in an E-R diagram? (d)
a. one-to-many
b. many-to-many
c. one-to-one
d. zero-to-one
17. The purpose of normalisation is: (c)
a. Normalisation shows the relationships between entities.
b. Normalisation identifies the resources, events and agents of a business process.
c. Normalisation ensures the tables in a database are well-designed and free from anomalies.
d. Normalisation is not important.
18. Which one of the following is NOT a benefit of standard SQL? (d)
a. cross-system communication
b. reduced training costs
c. retrieve information from many sources
d. increased dependence on a single vendor
19. Which one of the following is Data Manipulation Language (DML)? (b)
a. Create
b. Select
c. Grant
d. Revoke
20. Which one of the following is a FALSE statement on the advantages of visualisation? (c)
a. Information can be found more quickly.
b. Visualisations can help more people to learn and understand better about the data.
c.
It takes a lot less time to build visualisations than to write a report.
d. People process visualisations faster than information in written format.
SECTION B
Attempt 2 out of 3 questions. Total marks available = 40
21. Discussion (20 marks)
A computer programmer who works in the payroll department created a fake employee number and entered this employee into the payroll database. The programmer then programmed the payroll system to pay this fake employee and deposit the funds into one of their own bank accounts. Identify at least 2 control procedures that could have prevented the problem, and explain why each of the controls would have helped.
22. REA Diagram with Cardinality (20 marks)
Assume that you own five apartments in Dongqian Lake. Your business provides holiday apartment rental to people who take short-term holidays in Dongqian Lake. This is your main source of income. You want to design and build a database to keep track of your business. The following describes how your business runs.
The first thing that happens in your business is that one customer reserves a holiday home (i.e. one of your five apartments) for his/her vacation with family and friends. Reservations are made for specific days, and each day is associated with a specific cost. The entire rental must be paid at the time of reservation. Customers then come to the lake, stay in the holiday home and enjoy their holiday. Feedback and satisfaction rating forms are available in the apartment; customers can fill out the form if they would like to, or if they have any complaints while staying in the holiday home. After customers leave the apartment, one of your employees checks the apartment and identifies anything that has been damaged. You keep a record of the condition of each apartment after each rental and an estimate of the repair costs after each rental. You have three employees who maintain, clean and check the condition of the apartments.
Note: assume that there are no cancellation cases.
Required: Given the above brief overview, draw an E-R Diagram for your business. Remember to use the REA methodology of identifying resources, events and agents. Your diagram should also include cardinalities. Also, briefly describe the three steps involved in developing the REA diagram.
23. Table Normalisation (20 marks)
Use the following attributes to create tables that meet all of the normalisation rules. Note: you do not need to create any new attributes. This is only a portion of a database.
Supplier#, Supplier Name, Supplier City, Status of Supplier (e.g., currently used vs. not currently used), Part#, Part Colour, Part Weight, Total Quantity of Parts in Inventory, Quantity of each Part from each Supplier, Warehouse#, Warehouse City, Total Quantity of Each Part in each Warehouse.
Note: each Supplier# can offer multiple parts; each Part# can be from multiple suppliers; each Part# can be stored in multiple warehouses; each Warehouse# can store multiple parts.
Required: Clearly indicate all primary and foreign keys in your tables. Use the following notation:
• Tables in ( )
• Primary keys underlined
• Foreign keys in italics
Also, briefly describe your steps in normalising this database.
115021 – Economics Fundamentals
Assessment 3 – Case Study
Due Date: Week 10, Saturday
Weighting: 30%
Length: 800-1000 words

Task Description: Read the article written by Tina Morrison, "NZ avocado industry warned to brace for lower prices as key Aussie market swamped", www.stuff.co.nz/business/129202025/nz-avocado-industry-warned-to-brace-for-lower-prices-as-key-aussie-market-swamped (A copy of the article is reproduced below.)

Task:
1. Using the demand and supply model, explain and illustrate what is happening in the avocado market as described in the article. Your answer needs to clearly explain the impact on the market equilibrium price and quantity of avocados.
2. Explain, and illustrate with an economic model, the type of pricing policy the government can take to protect consumers and/or protect growers.
3. Assume avocados are sold in a perfectly competitive market. Explain, and illustrate graphically, how the change in market conditions (as described in the article) will affect an avocado grower that is currently making zero economic profit. (Hint: your answer should include graphs for both the market and the individual firm/farmer.)

Learning Outcomes for this Task:
LO2 - Explain how economic forces influence the operation of the economy.
LO3 - Apply economic analysis to assess different economic problems and evaluate the effects of economic policies on society.

Formatting: All assignments should be word processed, have a title and page numbers. Font size should be 12, with 1.5 line spacing. Write your name, student number, and teacher's name clearly on each page.

Draft: You must submit a draft prior to submitting the final version for assessment at a time specified by your teacher(s). You will be expected to take any feedback on your draft into account and put these suggestions into action in your final submission.

Citation and Referencing: A complete APA reference list must be included at the end of the report.
You must include at least 4 different research sources in your reference list at the end of your report, with corresponding in-text citations.

Assessment Criteria: To see the details of the assessment criteria, please see the end of this document.

Academic Misconduct: Any form of academic misconduct is taken very seriously. Work found to be plagiarised, ghost-written, or collaborated on will be penalised. Depending on the severity of the misconduct, a mark of zero may be given for the assessment. Please see your teacher for guidance on avoiding academic misconduct.

Submission: Both the draft and the final version are to be submitted via your class Stream page via Turnitin. If you are unsure of how to do this, please ask your teacher(s) before the due date. Marks will be awarded for submissions that are made on time. You will not receive these marks if your assessment is submitted late.

Special Consideration: If you are unwell or there is another compelling or compassionate reason that affects your assessment or your ability to submit it on time, you may apply for special consideration.
Assignment 3
Big Data and Machine Learning for Economics and Finance

Submission Rules: Provide an html document that is generated by RMarkdown and that contains the R code, the R output, and your comments on the output. Comment each line of your R code as well. Give thorough explanations throughout. Please note that the function set.seed() may not be used at any time in the assignment. Please note that, when providing your answers, you may not use any extra packages other than the ones explicitly mentioned in each exercise. For example, if the question says "the only extra packages allowed are ISLR2 and boot", then you may type library(ISLR2) and library(boot) when writing your answers to the questions in that exercise, but you may not type library(MASS) or library(any other package) anywhere in your submission. When asked to carry out a certain task (such as, for example, fitting a certain model or running a certain algorithm), it must be determined first whether that task is feasible or not, and when feasible, whether it can be carried out exactly as prescribed in the question or whether it can only be approximately carried out.

Exercise 1. (40 points) The only extra packages allowed in this exercise are tree and boot. Consider the following data generating mechanism: X is a uniform random variable on the interval [0, 100] and

Y = 1{X + U < 30} + 1{X + U > 90}

where U is a standard normal random variable that is independent of X. Assuming that X is the input variable and Y is the output variable, we are interested in comparing predictions from classification trees with ones based on logistic regressions.
1. Generate a sample of size n = 1000 from that model.
a. Using R, produce a scatterplot of X vs. Y.
b. Produce another plot representing the different observations of X, where each of the observations is given a different color depending on the value of Y. Are the colors separable using a hyperplane?
c.
We are interested in giving predictions for Y when x = 10, 50 or 90 using a classification tree. After fitting a tree to the data, show how to give predictions both using the function predict, and by arguing based on a graphical representation of the tree.
d. Run logistic regression and give predictions for the same 3 values of x.
e. Compare the prediction performance of the two methods.
2. Attempt to reproduce the results in the following figure using R.
Figure 1. Monte Carlo Experiment
3. Based on your knowledge, examine all classification methods learned in this course and establish which methods would perform well on samples drawn using the data generating mechanism described in this exercise. Make a table where on the left you write down the name of the method(s), and on the right you explain if you believe it would perform well while justifying your answer. Your answer and justification for each method should not be more than 10 words long. Please note that this part of the exercise should not be answered with any R coding.

Exercise 2. (30 points) The only extra packages allowed in this exercise are tree and boot. An applied data analyst is interested in assessing the performance of supervised learning when applied to the following data generation scheme

Z = Y^2 + U
X = exp(Z) + V

where Y, U and V are independent standard normal random variables.
1. Generate a sample of size n = 10000 from that model. Assuming that Y is the input variable and X is the output variable, we are interested in comparing CART and a Generalized Linear Model.
a. Construct a tree and show how to use it in order to make a prediction for X when y = 1. Use both the predict function and a plot of the tree to make the prediction.
b. Run a generalized linear model and use it to give a prediction for X corresponding to the same value of the input variable as the previous question.
c. Compare the prediction performance of the two methods.
2.
Another applied data analyst looks at a sample of size n generated from the same data generation scheme and concludes that a supervised learning prediction exercise does not make sense as the X and Y variables are seemingly uncorrelated.
a. Using the bootstrap, show whether the applied data analyst is correct in their conclusion regarding the correlation.
b. Do you believe that the applied data analyst is right in believing that a supervised learning exercise does not make sense in this particular case?
3. Based on your knowledge, examine all supervised learning methods learned in this course and establish which methods would perform well on samples drawn using the data generation scheme described in this exercise. Make a table where on the left you write down the name of the method(s), and on the right you explain if you believe it would perform well while justifying your answer. Your answer and justification for each method should not be more than 10 words long. Please note that this part of the exercise should not be answered with any R coding.

Exercise 3. (30 points) No extra packages are allowed in this exercise.
1. Consider the following dendrogram:
Figure 2. 10 observations
We are interested in clustering the data into three groups. What are the three groups obtained from the dendrogram? Which group contains the closest two points in the sample? Is the dendrogram well balanced?
2. Consider the following scatter plot representing the data on two variables X1 and X2:
Figure 3. 4 observations
Going from left to right, the same 4 observations have the following values for a third variable Y = (1, 2, 3, 0).
a. If Y is considered as the output variable in a supervised learning setting and (X1, X2) are considered the input variables, would this be a classification or a regression task?
b. If we were to fit a single split tree stump to this dataset, how many possible configurations are there?
c. Construct the optimal tree stump.
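The bootstrap step in Exercise 2, part 2(a), can be illustrated with a short simulation. The assignment itself must be answered in R with the boot package; the Python sketch below only shows the idea (resampling (X, Y) pairs with replacement and recomputing the correlation), with a smaller illustrative sample size and a fixed seed purely for reproducibility here:

```python
import math
import random

random.seed(42)  # illustration only; the R assignment forbids set.seed()

# Simulate the Exercise 2 data generating scheme: Z = Y^2 + U, X = exp(Z) + V,
# with Y, U, V independent standard normals.
n = 2000
Y = [random.gauss(0, 1) for _ in range(n)]
X = [math.exp(y * y + random.gauss(0, 1)) + random.gauss(0, 1) for y in Y]

def corr(a, b):
    """Sample Pearson correlation."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Nonparametric bootstrap: resample pairs with replacement, recompute the statistic.
B = 200
boot = []
for _ in range(B):
    idx = [random.randrange(n) for _ in range(n)]
    boot.append(corr([X[i] for i in idx], [Y[i] for i in idx]))
boot.sort()
ci = (boot[int(0.025 * B)], boot[int(0.975 * B)])  # percentile 95% CI
```

Because X depends on Y only through Y^2, sample correlations are typically near zero even though X and Y are strongly dependent, and the heavy tails of exp(Y^2) make the correlation estimate unstable; this is the tension the exercise asks you to discuss.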
1. Introduction to Recursive Calls in C Language

In C, recursion refers to the process where a function calls itself in order to solve smaller instances of a problem. A recursive function typically has two main components:
1. Base case: the condition that stops the recursion to prevent infinite calls.
2. Recursive case: the part where the function calls itself with modified parameters.
A common example of recursion is the calculation of factorials or Fibonacci numbers. Here's a basic example of a recursive function in C. In this example:
• The base case is when n == 0 or n == 1, where the recursion stops.
• The recursive case is when the function calls itself with n - 1.

Quiz: Understanding Recursive Calls
Here are two quizzes to test your understanding of how a recursive function works.
1. Below are C programs that contain a recursive function. Predict the output for n = 3.
2. Write a program to create a singly linked list with four nodes. Each node represents a user, containing the name and the age of this user. Example: Input: Output:
3. Fill in the blanks or complete the following program to make it work as described.
Output:
Length of str1: 6
Concatenated string (str1 + str2): APSIPA ASC 2024
Comparison result (strcmp(str1, str4)): -1 (since "APSIPA" is lexicographically smaller than "apsipa")
Converted integer + 10: 52
Copied string into str1: ASC 2024
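The factorial example described above (base case n == 0 or n == 1, recursive case n - 1) has the following shape; it is sketched here in Python for compactness, but the structure is identical in C with an int-returning function:

```python
def factorial(n):
    """Recursively compute n! for a non-negative integer n."""
    # Base case: stops the recursion.
    if n == 0 or n == 1:
        return 1
    # Recursive case: call the function again with a smaller argument.
    return n * factorial(n - 1)

print(factorial(3))  # 3 * 2 * 1 = 6
```

Tracing factorial(3) shows the call stack unwind: factorial(3) → 3 * factorial(2) → 3 * (2 * factorial(1)) → 3 * (2 * 1) = 6.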
FIN 532 Investment Theory
Problem Set 1
Fall 2024

1 Risk, Preferences, and Asset Allocation
1. You bought 100 shares of ABC Inc. common stock at $100 per share today at the opening of the market. ABC Inc. just announced a dividend of $2.00 per share payable in exactly one year from today. It is widely believed that one year from now the economy will either be in a 'recession', a state of 'normal growth', or a 'boom' with probabilities of 30%, 40%, and 30% respectively. After analyzing ABC Inc. you are convinced that the price of ABC stock a year from now in these various states of the economy will be:

State of Economy   Price of ABC Share
Recession          $80
Normal Growth      $110
Boom               $130

What are your estimated expected return and volatility over the next year to your investment in ABC stock?
2. TNC mutual fund invests 25% of their assets in IBM stock, 50% in GE stock, and 25% in T-Bills. You invested 50% of your wealth in the TNC mutual fund and the rest in T-Bills. What percentage of your wealth is invested in each stock and in T-Bills?
3. Suppose we are in a world with two equally likely states u and d, and we have three stocks A, B and C. Their net returns are given by the following table.

    u   d
A
B
C
Table 1: The net returns of the stocks.

Can you find a risk-averse investor who prefers stock A (or B) to stock C? Explain.
4. Now suppose the net returns of the three stocks are given by Table 2.

    u   d
A
B
C
Table 2: The net returns of the stocks.

(a) Let r̃A and r̃C be the net returns of stocks A and C respectively. Can you find a random variable z̃ such that r̃A = r̃C + z̃ and E[z̃] = 0?
(b) Can you find a risk-averse investor who prefers stock A (or B) to stock C?
5. Consider a risky portfolio that offers a rate of return of 15% per year with a standard deviation of 20% per year. Suppose an investor with mean-variance preferences is indifferent between investing in the risky portfolio and investing in a risk-free asset earning 8% per year.
a) What is the investor's risk aversion coefficient?
b) If allowed to invest in a combination of the risky portfolio and the risk-free asset, what proportion would the investor hold in the risky portfolio?
c) What is the expected rate of return and the standard deviation of the rate of return on the optimally chosen combination?
d) What would be the investor's certainty equivalent return for the optimally chosen combination?
6. In this question, you are asked to evaluate the common portfolio advice of a 60/40 split between stocks and bonds. Suppose the expected rate of return on equities is 8%/year and the standard deviation of the return on equities is 19%/year. T-Bills earn 1%/year (assume they are riskless).
(a) What is the implied risk aversion coefficient of an investor for whom a 60/40 split is optimal?
(b) Plot the CAL along with a couple of indifference curves for the investor type identified above.
7. For this exercise, you will have to download data on equity returns from 1926 to 2022 from Kenneth French's Data Library (http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html). You will download data on the excess returns of stocks over T-bills; they are available near the top of the page under Fama/French 3 Factors. You need the variable Mkt-RF. The variable is available at 4 different frequencies: annual, monthly, weekly and daily.
(a) Compute the mean and standard deviation of stock returns at different frequencies, including their standard errors. To make results comparable, express everything in an annual frequency. To a first order, this means multiplying monthly returns by 12, weekly returns by 52, and daily returns by 250 (there are approximately 250 trading days in a year). Compare your estimates of the mean and standard deviation (of annualized returns) across these different frequencies. How does the precision of your estimates (the tightness of the confidence intervals) change? Discuss.
(b) For each decade, compute the mean return in the stock market, and its volatility. You can use monthly data for this exercise. Do your estimates of the mean and volatility vary across decades? Are your estimates statistically different?
(c) For this part, we will only use daily returns. For each year in the sample, compute the realized volatility (i.e. standard deviation) of daily market returns. Plot the resulting yearly observations. Is market volatility constant over time?
For this exercise, we will need a software package that allows you to estimate means and standard deviations, along with confidence intervals. If the software you use does not provide you with standard errors, you can consult your statistics textbook (or Wikipedia) and compute them manually.
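The computation pattern for questions 1 and 5 can be sketched as follows. The one-year return includes the $2 dividend, r = (P1 + D − P0)/P0, and the mean-variance part assumes the common utility convention U = E(r) − (1/2)Aσ²; check that this matches the convention used in class before relying on it:

```python
import math

# Question 1: state-dependent returns, including the $2 dividend.
p0, dividend = 100.0, 2.0
prices = [80.0, 110.0, 130.0]   # recession, normal growth, boom
probs = [0.30, 0.40, 0.30]
returns = [(p1 + dividend - p0) / p0 for p1 in prices]

exp_ret = sum(p * r for p, r in zip(probs, returns))
variance = sum(p * (r - exp_ret) ** 2 for p, r in zip(probs, returns))
vol = math.sqrt(variance)

# Question 5: risk aversion A from indifference, assuming U = E(r) - 0.5*A*sigma^2.
# Indifference between the risky portfolio and the risk-free asset means
# E(r) - 0.5*A*sigma^2 = r_f.
e_risky, sigma_risky, r_f = 0.15, 0.20, 0.08
A = (e_risky - r_f) / (0.5 * sigma_risky ** 2)

# Optimal risky share for a mean-variance investor: y* = (E(r) - r_f) / (A*sigma^2).
y_star = (e_risky - r_f) / (A * sigma_risky ** 2)
```

Under these assumptions the pattern yields an expected return of 9% with volatility of roughly 19.5% for question 1, and A = 3.5 with an optimal risky share of 50% for question 5; verify the algebra yourself against the utility convention your course uses.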
Principles & Methods of Epidemiology: Paper Critique
Weighting: Paper critique: 1,500 words; 65% of overall grade
Due date: Friday 10th January 2025, 18.00

Assignment description: This is a written assessment. This assessment will be based on the following reading: Lee et al. Moderate alcohol intake reduces risk of ischemic stroke in Korea. Neurology (85), Dec 1, 2015. You will also find the article on Blackboard. Utilising the concepts you have learned, please write a critique of the Lee et al. (2015) journal article that identifies, evaluates, and responds to the authors' ideas, both positively and negatively. As you complete this assessment, you may use further literature if you wish, but this is not necessary. Please note that although guiding questions are provided, your submission should be an essay (not a series of questions and answers) and you do not need to follow the order of the guiding questions.

Critique questions:
Q1: What is the main aim or research question?
Q2: Why do you think the authors have chosen this study design? Do you think that an alternative study design may have been more suitable? Explain why or why not.
Q3: What are potential sources of information bias in this study? Have the authors made any efforts to address these? Were they successful? What else might have been done?
Q4: What are potential sources of selection bias in this study? Have the authors made any efforts to address these? Were they successful? What else might have been done?
Q5: Using data from the paper, calculate and interpret the unadjusted Odds Ratio (and its 95% Confidence Interval) of stroke among those with an average intake of 5-6 drinks vs. non-drinkers. How does it compare with the adjusted OR and why?
Q6: What are potential sources of confounding in this study? Have the authors made any efforts to address these? Were they successful? What else might have been done?
Q7: Is there evidence of effect modification in the association between alcohol intake and stroke? Have the authors explored this? If yes, how? If not, what could they have done?
Q8: Taking into account the above and any other relevant factors, have the authors given a satisfactory answer to their research question?

General Guidance: You may wish to think about this general guidance: this critique is not designed to be a summary of the manuscript. Analyse the authors' arguments and whether they are successful in conveying their research and findings, but also think beyond what is mentioned in the manuscript. Think about how this article fits within the broader literature on the topic from a public health point of view. Do not expand beyond 1,500 words. The markers will not consider any text that is beyond the word limit.
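For the odds-ratio calculation in Q5, the standard approach is the cross-product ratio of a 2×2 table with a Wald confidence interval computed on the log scale. The counts below are hypothetical placeholders, not values from Lee et al. (2015); substitute the numbers reported in the paper:

```python
import math

# Hypothetical 2x2 table (exposure = 5-6 drinks vs. non-drinkers):
#                 stroke (cases)   no stroke (controls)
# exposed               a                b
# unexposed             c                d
a, b, c, d = 20.0, 80.0, 10.0, 90.0  # placeholder counts only

# Unadjusted odds ratio: cross-product ratio.
odds_ratio = (a * d) / (b * c)

# 95% CI via the log-OR standard error: SE = sqrt(1/a + 1/b + 1/c + 1/d).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
log_or = math.log(odds_ratio)
ci_low = math.exp(log_or - 1.96 * se_log_or)
ci_high = math.exp(log_or + 1.96 * se_log_or)
```

Interpreting the result: if the interval excludes 1, the unadjusted association is statistically significant at the 5% level; comparing it with the adjusted OR speaks to Q6's confounding discussion.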
STATS 769 STATISTICS Data Science Practice SECOND SEMESTER, 2016

1. [5 marks] Figure 1 shows the content of a JSON file, "AT.json", and the following code reads this file into R and shows the resulting R object, trips.

> library(jsonlite)
> trips <- fromJSON("AT.json")
> trips
    id stop_time_update.stop_sequence stop_time_update.stop_id  timestamp
1 2928                             20                     7812 1474316682
2 2929                             60                     6569 1474316645
> dim(trips)
[1] 2 3
> names(trips)
[1] "vehicle"          "stop_time_update" "timestamp"

Explain what sort of R object has been created and write R code to extract the stop_id information from the R object trips.

[ { "vehicle": { "id": "2928" }, "stop_time_update": { "stop_sequence": 20, "stop_id": "7812" }, "timestamp": 1474316682 }, { "vehicle": { "id": "2929" }, "stop_time_update": { "stop_sequence": 60, "stop_id": "6569" }, "timestamp": 1474316645 } ]
Figure 1: The JSON file "AT.json".

2. [5 marks] Figure 2 shows the content of an XML file, "IRD.xml". Write down the result of the following R code:

> library(xml2)
> ird <- read_xml("IRD.xml")
> xml_text(xml_find_all(ird, "//td[@align = 'right']"))

Write R code to extract the first column of values from the table. Your code should produce the following result:

[1] "18 NCO Club"                   "1977 Masters Association"
[3] "1979 Reunion"                  "1993 Summer Camp Account"
[5] "1St Wainuiomata Venterer Unit" "44 South Travel"
[7] "81 Masters Association"

18 NCO Club $142.03
1977 Masters Association $359.77
1979 Reunion $532.77
1993 Summer Camp Account $1,308.78
1St Wainuiomata Venterer Unit $431.14
44 South Travel $489.60
81 Masters Association $221.08
Figure 2: The XML file "IRD.xml".

3. [5 marks] Explain what each of the following shell commands is doing and, where there is output, what the output means:

pmur002@sc-stat-346130:/home/paul$ ssh stats769prd01.its.auckland.ac.nz
pmur002@stats769prd01:~$ mkdir exam
pmur002@stats769prd01:~$ cd exam
pmur002@stats769prd01:~/exam$ cp /course/data/Ass2/exreg-10000.* .
pmur002@stats769prd01:~/exam$ ls -lh
total 16M
-rw-rw---- 1 pmur002 pmur002 7.8M Sep 20 12:12 exreg-10000.bin
-rw-rw---- 1 pmur002 pmur002  473 Sep 20 12:12 exreg-10000.desc
-rw-rw---- 1 pmur002 pmur002 7.7M Sep 20 12:12 exreg-10000.txt
pmur002@stats769prd01:~/exam$ wc exreg-10000.txt
  10000 1010000 7979235 exreg-10000.txt
pmur002@stats769prd01:~/exam$ awk 'NR < 1000' exreg-10000.txt > exreg-sub.txt

4. [5 marks] Explain the memory usage results from the following R code and output. What is the significance of the NULL at the end of the function definition?

> gc(reset=TRUE)
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  326457 17.5     592000 31.7   326457 17.5
Vcells  535537  4.1    1023718  7.9   535537  4.1
> x gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  326423 17.5     592000 31.7   328238 17.6
Vcells 1535490 11.8    2613614 20.0  1537444 11.8
> f gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  326438 17.5     592000 31.7   330982 17.7
Vcells 1535523 11.8    2613614 20.0  2541622 19.4

5. [5 marks] Figure 3 shows the first few lines of a CSV file called "1987.csv". This file contains flight data from 1987 for flights within the USA.

Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
1987,10,14,3,741,730,912,849,PS,1451,NA,91,79,NA,23,11,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
1987,10,15,4,729,730,903,849,PS,1451,NA,94,79,NA,14,-1,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA
...
Figure 3: The first few lines of the CSV file "1987.csv".

The following R code and output shows the time and memory requirements involved in naively reading the file "1987.csv" into an R data frame and then calculating the average flight departure delay for each day of the week.
> system.time(f1987 <- read.csv("1987.csv"))
> object.size(f1987)
152204024 bytes
> system.time(delays <- ...)
> delays
  DoW DepDelay
1   1 7.827491
2   2 9.086585
3   3 9.364805
4   4 8.143775
5   5 7.411825
6   6 6.034632
7   7 8.408912

Write R code that calls the read.csv() function, but with additional arguments that would make the call run faster and create a smaller data frame than the code above. Explain why your code would be faster and use less memory. Write R code that uses functions from the data.table package to read the file "1987.csv" into R much faster and to calculate the average departure delay for each day of the week much faster.

6. [5 marks] The following R code and output shows the number of cores and the memory capacity of one of the virtual machines used in this course.

> detectCores()
[1] 20
> system("free")
             total      used      free  shared buffers   cached
Mem:     206350080  42760988 163589092      44  262884 40465868
-/+ buffers/cache:   2032236 204317844
Swap:      1949692    131744   1817948

The following R code and output shows information about the full set of CSV files ("1987.csv" to "2008.csv") that contain US flight data over 22 years.
> system("ls -lh /course/data/ASADataExpo/*.csv")
-rw-r--r-- 1 pmur002 pmur002 122M Aug 4 14:48 /course/data/ASADataExpo/1987.csv
-rw-r--r-- 1 pmur002 pmur002 478M Aug 4 14:48 /course/data/ASADataExpo/1988.csv
-rw-r--r-- 1 pmur002 pmur002 464M Aug 4 14:48 /course/data/ASADataExpo/1989.csv
-rw-r--r-- 1 pmur002 pmur002 486M Aug 4 14:50 /course/data/ASADataExpo/1990.csv
-rw-r--r-- 1 pmur002 pmur002 469M Aug 4 14:47 /course/data/ASADataExpo/1991.csv
-rw-r--r-- 1 pmur002 pmur002 470M Aug 4 14:49 /course/data/ASADataExpo/1992.csv
-rw-r--r-- 1 pmur002 pmur002 469M Aug 4 14:49 /course/data/ASADataExpo/1993.csv
-rw-r--r-- 1 pmur002 pmur002 479M Aug 4 14:48 /course/data/ASADataExpo/1994.csv
-rw-r--r-- 1 pmur002 pmur002 507M Aug 4 14:48 /course/data/ASADataExpo/1995.csv
-rw-r--r-- 1 pmur002 pmur002 510M Aug 4 14:52 /course/data/ASADataExpo/1996.csv
-rw-r--r-- 1 pmur002 pmur002 516M Aug 4 14:49 /course/data/ASADataExpo/1997.csv
-rw-r--r-- 1 pmur002 pmur002 514M Aug 4 14:47 /course/data/ASADataExpo/1998.csv
-rw-r--r-- 1 pmur002 pmur002 528M Aug 4 14:48 /course/data/ASADataExpo/1999.csv
-rw-r--r-- 1 pmur002 pmur002 544M Aug 4 14:47 /course/data/ASADataExpo/2000.csv
-rw-r--r-- 1 pmur002 pmur002 573M Aug 4 14:48 /course/data/ASADataExpo/2001.csv
-rw-r--r-- 1 pmur002 pmur002 506M Aug 4 14:47 /course/data/ASADataExpo/2002.csv
-rw-r--r-- 1 pmur002 pmur002 598M Aug 4 14:50 /course/data/ASADataExpo/2003.csv
-rw-r--r-- 1 pmur002 pmur002 639M Aug 4 14:47 /course/data/ASADataExpo/2004.csv
-rw-r--r-- 1 pmur002 pmur002 640M Aug 4 14:49 /course/data/ASADataExpo/2005.csv
-rw-r--r-- 1 pmur002 pmur002 641M Aug 4 14:49 /course/data/ASADataExpo/2006.csv
-rw-r--r-- 1 pmur002 pmur002 671M Aug 4 14:50 /course/data/ASADataExpo/2007.csv
-rw-r--r-- 1 pmur002 pmur002 658M Aug 4 14:47 /course/data/ASADataExpo/2008.csv

The following R code could be used to read all 22 CSV files into R as a single data frame and calculate the average departure delay for each day of the week.
filenames meanDepDelay
     [,1]      [,2]
[1,]    1  7.850057
[2,]    2  6.855870
[3,]    3  7.651197
[4,]    4  9.246910
[5,]    5 10.151539
[6,]    6  6.887023
[7,]    7  8.409293

Discuss the best way to schedule the 22 calls to sumFile() across the multiple cores (hint: think about whether each call to sumFile(), which handles a different CSV file, will take the same amount of time to run).

8. [5 marks] The following code was used to profile the sumFile() function.

> Rprof("sumFile.log")
> sumFile("1987.csv")
> Rprof(NULL)

The profile results are summarised below using summaryRprof() ...

> summaryRprof("sumFile.log")
$by.self
          self.time self.pct total.time total.pct
"fread"        1.38    97.18       1.38     97.18
"!"            0.02     1.41       0.02      1.41
"forderv"      0.02     1.41       0.02      1.41

$by.total
               total.time total.pct self.time self.pct
"sumFile"            1.42    100.00      0.00     0.00
"fread"              1.38     97.18      1.38    97.18
"["                  0.04      2.82      0.00     0.00
"[.data.table"       0.04      2.82      0.00     0.00
"!"                  0.02      1.41      0.02     1.41
"forderv"            0.02      1.41      0.02     1.41

$sample.interval
[1] 0.02

$sampling.time
[1] 1.42

... and using profReport().

> profReport("sumFile.log")
sumFile > fread
--------------- 1.38
sumFile > [ > [.data.table > !
------------------------------ 0.02
sumFile > [ > [.data.table > forderv
------------------------------------ 0.02

Explain the profile results and what they tell us about how the sumFile() function works and where it spent most of its time.

9. [5 marks]
i. Explain why we might need to use the functions GET() or POST() from the httr package to access a web site, rather than just the download.file() function.
ii. Explain why we might need to use the functions makeCluster() and clusterApply(), instead of the mclapply() function.

10. [10 marks] Table 1 shows the SSresidual and adjusted R² values for a model selection procedure of a linear regression problem with 6 input variables, x1, ..., x6. At step i, the model contains the input variable which is shown in column "Variables entered" and the variables at all previous steps.
At step i, the SSresidual column shows the value of the residual sum of squares after the variable in the "Variables entered" column has been added to the previous model (from step i − 1). This value is the smallest SSresidual among all possible models that add a single predictor to the previous model (from step i − 1).
1. Which of the model selection procedures is used in this problem?
2. Based on Table 1, what is the best linear predictor for this problem?

Step  Variables entered  SSresidual  Adjusted R²
0     Intercept          34.026      0.700
1     x4                 33.789      0.707
2     x3                 33.584      0.703
3     x2                 33.583      0.713
4     x5                 33.586      0.714
5     x1                 33.590      0.713
6     x6                 33.595      0.701
Table 1: Summary of the model selection procedure

11. [10 marks] The following code has been used to fit a k-nearest neighbor classifier for given data.

> library(class)
> train test cl knnout
> knnout
 [1] s s s s s s s s s s s s s s s s s s s s s s s s s c c v
[29] c c c c c v c c c c c c c c c c c c c c c c v c c v v v
[57] v v v v v v v c v v v v v v v v v v v
Levels: c s v
> table(knnout, cl)
      cl
knnout  c  s  v
     c 23  0  3
     s  0 25  0
     v  2  0 22

Based on the output of the code, calculate
1. the misclassification rate for the test data.
2. the sensitivity and specificity for the test data.

12. [10 marks] Kernel density estimation is an unsupervised learning procedure that estimates the probability density of a new observation x0 by counting observations close to it, with weights that decrease with distance from it. Formally, for a given univariate random sample x1, ..., xn drawn from a probability density gX(x), kernel density estimation uses the following formula to estimate the probability density gX(x) at a new observation x0:

ĝX(x0) = (1/(nh)) ∑_{i=1}^{n} Kh(x0, xi)

where n is the sample size and h is a tuning parameter for the kernel function K. Assume that a given learning sample L, with one input variable, contains 100 data points from population I and 100 data points from population II.
We want to classify the population of a new observation based on this learning set. This problem is a classification problem with one input variable and a response with two categories. Explain a classifier algorithm which uses kernel density estimation to classify the new observation.

13. [10 marks] Explain the k-nearest neighbor algorithm for fitting models (a) when doing prediction of a continuous response, and (b) when doing classification.

14. [10 marks] Explain the steps of how you would build a model for a classification problem when you receive a data set of 30,000 observations, 100 inputs and a categorical response variable.
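As an illustration of question 11, the misclassification rate and per-class sensitivity/specificity can be read directly off the printed confusion matrix (rows = predicted class, columns = true class). A Python sketch using the counts from the table(knnout, cl) output above:

```python
# Confusion matrix from table(knnout, cl): rows = predicted, columns = true,
# classes in the order c, s, v.
labels = ["c", "s", "v"]
cm = [
    [23, 0, 3],   # predicted c
    [0, 25, 0],   # predicted s
    [2, 0, 22],   # predicted v
]

n = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(3))
misclassification_rate = 1 - correct / n   # share of off-diagonal counts

# Per-class sensitivity (recall) and specificity, treating each class as "positive".
sens, spec = {}, {}
for k, lab in enumerate(labels):
    true_pos = cm[k][k]
    actual_pos = sum(cm[i][k] for i in range(3))           # column total: truly class k
    true_neg = sum(cm[i][j] for i in range(3) for j in range(3)
                   if i != k and j != k)                    # neither predicted nor truly k
    sens[lab] = true_pos / actual_pos
    spec[lab] = true_neg / (n - actual_pos)
```

Here 5 of the 75 test observations fall off the diagonal, giving a misclassification rate of 5/75 ≈ 6.7%; class s is classified perfectly.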
FIN 532 Investment Theory
Problem Set 2
Fall 2024

1 Constructing the Minimum Variance Frontier
You are considering investing in two stocks. There are two possible states for the economy over the next year: 'Good' and 'Bad'. Each state is equally likely (that is, the probability of each state is 50%). Their return in each possible state is estimated as follows:

State   Return to stock A   Return to stock B
Good    30%                 5%
Bad     10%                 10%

(a) What are the expected return and volatility of each stock return?
(b) What are the covariance and correlation between the two stock returns?
(c) Construct the minimum variance frontier that is possible by investing in these two stocks (assume no short selling).
(d) Suppose that a risk-free rate of 5% is available for borrowing or lending. Can you construct a portfolio with no risk and a return greater than the risk-free rate? Explain.

2 MPT with Two Risky Assets
You have recently inherited $150,000 and have decided that you should invest the money. You have identified three funds which seem like a good fit for your investment goals: a risk-free short-term bond fund (f), a long-term bond fund (B), and a stock market index fund (S). Your research revealed the following information about the 3 funds:

                             Expected Return: E(r)   Volatility: σ(r)
Risk-Free Fund (f)           0.035                   0.000
Long-Term Bond Fund (B)      0.060                   0.075
Stock Market Index Fund (S)  0.110                   0.180

Correlation between B and S: ρBS = 0.75

(a) First you consider investing 1/3 of your inheritance in each of the 3 funds. What is the expected return and volatility of this portfolio?
(b) Having taken the first two weeks of Fin-532, you know that you can construct a more efficient portfolio than simply putting an equal weight of your inheritance in each fund. You start by constructing the Mean-Variance Efficient (MVE) portfolio.
(i) What are the portfolio weights in the MVE portfolio?
(ii) What is the expected return and volatility of the MVE portfolio?
(c) After considering your investment goals and risk tolerance, you've decided to make sure your portfolio volatility is no greater than 12%. Given this restriction, how should you allocate the inheritance between f, B, and S to maximize your expected return?

3 Matrix Algebra and Portfolio Moments
The file HW2data.csv contains monthly historical returns of 5 industry portfolios from July 1926 to June 2023. (Source: Professor Kenneth French's website.) To get you started in Matlab, first save the data file into a directory on your computer. Then create a new script file and save it in the same directory. At the beginning of the script file, you can use the following code to read in the data:

T = readtable('HW2data.csv');
Rets = csvread('HW2data.csv',1,1);

Rets is a matrix with 5 columns where each column is a time-series of returns for a different industry portfolio. T is a table with 6 columns, where the first column has the month/year of the return.

(a) Compute the annualized expected returns and covariance matrix. (Hint: Estimate the expected returns and covariance matrix of monthly returns first, then multiply them by 12.)
(b) Next construct the correlation matrix (recall that the correlation between assets A and B is given by ρAB = Cov(rA, rB)/(σA σB)).
(c) Suppose the risk-free rate is 1%. Compute the Sharpe ratio of each of the industry portfolios.
(d) Compute the expected return and standard deviation of an equal-weighted portfolio with weight 1/5 for each industry portfolio. What is the Sharpe Ratio of this portfolio?
(e) Looking at the Sharpe Ratio of each industry portfolio as well as the correlation matrix, which of the industry portfolios looks most attractive? Which portfolio looks least attractive?
(f) Using your answer to the previous question, find a modification of the equally weighted portfolio (i.e., add some weight to one portfolio and subtract some weight from another) that delivers a higher Sharpe Ratio than both the equally weighted portfolio and any of the individual industry portfolios.
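The Matlab workflow for parts (a)-(e) can also be sketched in Python. The sketch below uses synthetic random returns as a stand-in for HW2data.csv (the actual file is not reproduced here), so the numbers it produces are illustrative only; the structure of the calculation is what matters:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for HW2data.csv: 1164 months x 5 industry portfolios.
# In practice, load the real file instead, e.g.:
#   rets = np.genfromtxt('HW2data.csv', delimiter=',', skip_header=1)[:, 1:]
rets = rng.normal(0.008, 0.05, size=(1164, 5))

mu = 12 * rets.mean(axis=0)            # annualized expected returns (part a)
cov = 12 * np.cov(rets, rowvar=False)  # annualized covariance matrix (part a)

sd = np.sqrt(np.diag(cov))             # annualized volatilities
corr = cov / np.outer(sd, sd)          # correlation matrix (part b)

rf = 0.01
sharpe = (mu - rf) / sd                # Sharpe ratio of each portfolio (part c)

w = np.full(5, 0.2)                    # equal-weighted portfolio (part d)
mu_p = w @ mu                          # portfolio expected return
sd_p = np.sqrt(w @ cov @ w)            # portfolio standard deviation
sharpe_p = (mu_p - rf) / sd_p
```

Note that annualizing the monthly covariance matrix by multiplying by 12 scales variances, so the annualized volatility is sqrt(12) times the monthly volatility.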
CEG8526: Hydrosystems modelling and management
Practical 1: Using climate model information

Aim and learning outcomes

The purpose of this practical is to use some simple online tools to understand how information from GCMs is commonly presented and how this output may be interpreted to understand future climate change. The practical also encourages you to think about the limitations of these tools and the information they provide in terms of assessing and responding to the impacts of climate change. After completing this practical you should be able to:

• Interpret climate data (observations and model output) using standard methods of climate model data visualisation.
• Summarise regional changes in key climate variables from CMIP6 models quantitatively.
• Understand and explain the sources of uncertainty in climate model projections of future climate.
• Identify the limitations of climate model output for decision-making.

The understanding of climate model outputs gained in this practical will be essential in preparing to use the UKCP18 projections in your assessed coursework.

Practical summary

General circulation models (GCMs) are a fundamental tool in deriving global-scale projections of future climate, while regional climate models (RCMs) provide more local-scale information at higher resolutions. GCMs thus provide information that underlies global action to mitigate climate change, and also provide the boundary conditions for the RCMs that are used to understand the potential regional and local impacts of climate change on society and how we might adapt in the future. In this practical you will use a freely available web tool to explore the latest climate projections and produce a summary of regional climate change. The IPCC WGI Interactive Atlas is a tool providing spatial and temporal analyses of much of the observed and projected climate change information underpinning the Working Group I contribution to the Sixth Assessment Report.
The IPCC Atlas homepage allows you to access its different features. The first time you access the homepage you will have the option of a quick tour of the functionality; please take a few moments to take the tour, as this will help familiarise you with the location of the various tools. The atlas provides three main products, accompanied by documentation and user guidance:

Simple: allows you to view annual and seasonal climate and climate change under 1.5°C, 2°C, 3°C and 4°C of global warming.
Advanced: allows you to view annual and seasonal climate and climate change for different scenarios using a range of different model experiments (CMIP5, CMIP6, CORDEX). You can also examine model simulations over different historical periods as well as observations from different global and regional datasets.
Regional synthesis: allows you to examine mapped or tabulated summaries of projections (with confidence information) and past trends (with attribution information).
Documentation: provides access to user guidance, videos and tutorials.

The user guidance page provides information to help you understand the different forms of output provided, including the time series, annual cycle, global warming level (GWL) and climate stripe plots. You should work through this worksheet in order, and answer questions 1A, 2A-C, 3A-C, and 4A.

Activity 1 - deriving projections using the simple tool

Complete the following tasks and, where you see a symbol, note down your answers for discussion later. To get used to navigating the atlas and to the type of information it provides, the first task uses the simple option. Once in the simple atlas, use the menu selections for variable, quantity & scenario, and season:

1A. Derive projections of the change in maximum (daily) temperature (TX) for the South Asia region (centred on India) in summer (June to August) under 1.5°C and 4°C of global warming, including estimates of the uncertainty in the projections.
To do this you will need to select the region from the map and then select the relevant options from the menu bar. You can then use the range of options at the bottom of the display to view the regionally aggregated information in different formats. The 'Table Summary' will be most useful for completing the table below. Here you should derive the average change and the P10|P90 ranges. What do the ranges (e.g. P10|P90) represent here?

Global warming level | Change in TX and P10|P90 range
1.5°C                |
4°C                  |

You can check your answers, including comparing maps, on the Canvas page for this practical. The default setting is for the regionally aggregated information to be presented in small plots under the map. You can enlarge these by dragging the symbol up to cover the map. When you have finished, you can drag it back down again to revert to the map view.

Activity 2 - comparing the performance of models against observed data

Next use the advanced option. You can do this by clicking on Home in the top right or selecting the advanced option from the dropdown menu. You will now see that you have an extra option: a choice of different datasets, which provides the option to look at model projections (future), model historical (simulations of past climate) and observations (datasets of historical weather observations). Using the menu selections for variable, quantity & scenario and season, this time we will explore some of the observational datasets in the atlas. Select the GPCC (Global Precipitation Climatology Centre) dataset from the list of observations to look at Total precipitation (PR). For a given observational dataset the available variables are highlighted in white; you will note that each dataset only provides certain variables. First, using the period 1961-2015, look at the value (mm/day) for this variable (for observations you can also examine the trend). Use the zoom tool to focus in on Australia and examine the data for the December-February season.
You should get a map like the figure below. Note that we do not have the complete rain-gauge coverage the map would suggest; researchers produce gridded datasets of observations through statistical techniques of spatial interpolation. This allows us to better compare observations with models, but be aware that it introduces uncertainty into the estimated gridded observations.

Now click on the Duplicate map icon on the right. This will split the screen in two, with a duplicate of the map. Each side of the map will now have its own set of menus to the left or right. Using one of the maps, change the dataset selection to Model Historical|CMIP6 (at the bottom of the menu). In the quantity & scenario menu select time periods that closely match: 1980-2015 for the observations and 1981-2010 for the CMIP6 historical simulations. This is a typical model-performance approach for validating climate models. You can again check your maps on Canvas.

2A. Write a short description of how well the climate models reproduce the observed patterns of PR. Include some quantitative information; you can use the point information tool to compare the data at specific locations. For example, do the models simulate the magnitude (values) and spatial patterns of mean precipitation well?

2B. List any other features of precipitation you might want to validate against the observations, in addition to the seasonal average, if you were interested in using the models to assess future impacts of the climate on society.

Now, using the map that currently shows the historical observations, use the dataset menu to change the dataset from observations/GPCC to model historical/CORDEX Australasia for 1981-2010.

2C. Write down your observations about the difference in the resolution of the data compared with that from the GCMs. What type of model has been used in the CORDEX experiments? What features of the climate might be better represented by these models and why?
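The kind of quantitative model-versus-observations comparison asked for in 2A can be sketched as follows. This is a minimal illustration, assuming both fields have been regridded to a common grid; the arrays below are synthetic stand-ins, not actual GPCC or CMIP6 data:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-ins for DJF precipitation climatologies (mm/day) on a
# common 40x50 grid; in practice these would come from the gridded GPCC
# observations and the CMIP6 historical ensemble mean.
obs = rng.gamma(shape=2.0, scale=1.5, size=(40, 50))
model = obs + rng.normal(0.3, 0.8, size=obs.shape)  # model with a wet bias

bias = (model - obs).mean()                   # mean bias (mm/day)
rmse = np.sqrt(((model - obs) ** 2).mean())   # root-mean-square error
# Spatial pattern correlation between the two fields
pattern_corr = np.corrcoef(model.ravel(), obs.ravel())[0, 1]
```

Beyond these seasonal-mean statistics, a fuller validation (question 2B) would also examine features such as variability and the frequency of wet days or extremes.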
Activity 3 - exploring projections of climate change using scenarios

Finally, we will look at some projections using CMIP6 and scenarios. Revert to a single map view using the symbol. Select CMIP6 from the dataset menu, and from the variable list we will look at one of the precipitation extremes indices, maximum 1-day precipitation (RX1day), for December-February. Look at the long-term percentage change (2081-2100) for scenario SSP5-8.5 relative to the baseline of 1981-2010 for Northern Europe.

RX1day is an example of a climate index. An index is a simple diagnostic quantity used to characterise an aspect of the climate system, such as extreme temperature or precipitation. A common set of indices used in climate analyses are those defined by the Expert Team on Climate Change Detection and Indices (ETCCDI), and these have been calculated for visualisation in the climate change atlas. Look at the other indices that are available.

3A. Write down a brief summary of the projected changes for the region (you can also use any regional summary plots in the lower panel), commenting on how confident you would be in applying these changes in practice to assess future flood risk. To do this, use your knowledge of the limitations of these models, and also the uncertainty information provided in the atlas. We recommend you use the simple visualisation of uncertainty.

Remember, the maps show the mean of a collection (ensemble) of models. The number of models used is shown in the title of each map. For this simple method, no overlay indicates that the models in the ensemble are in agreement, in the sense that at least 80% of the models agree on the sign of the change in individual grid cells. An overlay of diagonal lines (/) indicates low model agreement, where fewer than 80% of models agree on the sign of the change.

Next, examine the Global Warming Level (GWL) plot for this region showing the relationship between warming and RX1day (see below).
To do this, make sure you select value (mm) and not change (%) in the quantity selection option.

3B. Write down your observations of what this tells us about the relationship between temperature and extreme precipitation. What is the physical basis for this relationship?

Finally, compare the CMIP6 projections with those for the CORDEX Europe experiment.

3C. If you were interested in applying climate change uplifts for extreme precipitation for flash flooding in urban areas, how confident would you be in applying the output from each product?

Activity 4 - applying climate model information

Finally, let's put all this information together with a plausible working scenario in which you might use a range of information available in the atlas. You are required to produce a climate change assessment for a conference of regional city leaders. You need to produce a summary providing evidence for why they need to develop climate change plans in their administration, both in terms of mitigation of climate change and adaptation to future climatic (and hydrological) hazards. You should use the IPCC Atlas to provide evidence making the case for action and for declaring a Climate Emergency (see the statement on Newcastle's Climate Emergency as an example). This might include evidence from the Atlas of:

• already observed climate change, using the option to visualise trends in observed data;
• projections of future change. You should look at a range of scenarios or global warming levels to demonstrate the effect of potential emissions reductions, and also examine a range of variables, indices and indicators that demonstrate the potential impacts that they may need to plan for across different sectors to underpin the declaration of an emergency.

4A.
Select one region of interest to you and, for that region, use appropriate model output to write down at least three pieces of data you would provide to the leaders (where appropriate you should relate them to specific types of hazards that might affect the region). You should also reflect on the degree to which the available information tells you what you/city leaders need to know.

a. Write down the limitations of the information you have provided and ideas for how you might go about providing a more robust estimate of regional climate change.

b. Make a list of what other information/data and tools you might need to translate this into an assessment of local climate change and the impacts of that change (remember that higher temperatures or more intense rainfall are changes in climate, but are not impacts on human or natural systems).

Once complete, post your summary in the climate change discussion board; you may compare your analyses with the regional fact sheets published by the IPCC.