Stat 435: Homework Assignments 1-8

Homework 1

1. We will perform k-nearest neighbors in this problem, in a setting with 2 classes, 25 observations per class, and p = 2 features. We will call one class the "red" class and the other class the "blue" class. The observations in the red class are drawn i.i.d. from a N_p(μ_r, I) distribution, and the observations in the blue class are drawn i.i.d. from a N_p(μ_b, I) distribution, where μ_r = (0, 0)ᵀ is the mean in the red class, and μ_b = (1.5, 1.5)ᵀ is the mean in the blue class.

(a) Generate a training set consisting of 25 observations from the red class and 25 observations from the blue class. (You will want to use the R function rnorm.) Plot the training set. Make sure that the axes are properly labeled, and that the observations are colored according to their class label.

(b) Now generate a test set consisting of 25 observations from the red class and 25 observations from the blue class. On a single plot, display both the training and test sets, using one symbol to indicate training observations (e.g. circles) and another symbol to indicate test observations (e.g. squares). Make sure that the axes are properly labeled, that the symbols for training and test observations are explained in a legend, and that the observations are colored according to their class label.

(c) Using the knn function in the class library, fit a k-nearest neighbors model on the training set, for a range of values of k from 1 to 20. Make a plot that displays the value of 1/k on the x-axis and classification error (both training error and test error) on the y-axis. Make sure all axes and curves are properly labeled. Explain your results.

(d) For the value of k that resulted in the smallest test error in part (c), make a plot displaying the test observations as well as their true and predicted class labels. Make sure that all axes and points are clearly labeled.

(e) In this example, what is the Bayes error rate? Justify your answer.
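The assignment is to be done in R with the class library's knn function; purely as an illustration of the simulation in parts (a)-(c), here is a minimal NumPy sketch of the same setup (the helper names make_set and knn_predict are ours, not from any library):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_set(n_per_class=25):
    # 25 red observations ~ N_2((0,0), I) and 25 blue ~ N_2((1.5,1.5), I);
    # labels: 0 = red, 1 = blue.
    red = rng.normal(loc=0.0, size=(n_per_class, 2))
    blue = rng.normal(loc=1.5, size=(n_per_class, 2))
    return np.vstack([red, blue]), np.repeat([0, 1], n_per_class)

def knn_predict(X_train, y_train, X_query, k):
    # Majority vote among the k nearest training points (Euclidean distance).
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argpartition(d, k - 1, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

X_train, y_train = make_set()
X_test, y_test = make_set()
test_err = {k: (knn_predict(X_train, y_train, X_test, k) != y_test).mean()
            for k in range(1, 21, 2)}  # odd k avoids voting ties
```

Plotting test_err against 1/k (together with the training errors) gives the curve asked for in part (c).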
2. We will once again perform k-nearest neighbors in a setting with p = 2 features. But this time, we'll generate the data differently: let X₁ ∼ Unif[0, 1] and X₂ ∼ Unif[0, 1], i.e. the observations for each feature are i.i.d. from a uniform distribution. An observation belongs to class "red" if (X₁ − 0.5)² + (X₂ − 0.5)² > 0.15 and X₁ > 0.5; to class "green" if (X₁ − 0.5)² + (X₂ − 0.5)² > 0.15 and X₁ ≤ 0.5; and to class "blue" otherwise.

(a) Generate a training set of n = 200 observations. (You will want to use the R function runif.) Plot the training set. Make sure that the axes are properly labeled, and that the observations are colored according to their class label.

(b) Now generate a test set consisting of another 200 observations. On a single plot, display both the training and test sets, using one symbol to indicate training observations (e.g. circles) and another symbol to indicate test observations (e.g. squares). Make sure that the axes are properly labeled, that the symbols for training and test observations are explained in a legend, and that the observations are colored according to their class label.

(c) Using the knn function in the class library, fit a k-nearest neighbors model on the training set, for a range of values of k from 1 to 50. Make a plot that displays the value of 1/k on the x-axis and classification error (both training error and test error) on the y-axis. Make sure all axes and curves are properly labeled. Explain your results.

(d) For the value of k that resulted in the smallest test error in part (c), make a plot displaying the test observations as well as their true and predicted class labels. Make sure that all axes and points are clearly labeled.

(e) In this example, what is the Bayes error rate? Justify your answer, and explain how it relates to your findings in (c) and (d).
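The class-assignment rule in problem 2 is easy to misread from the inequalities alone; as a sketch, here it is as a plain Python function (the function name is ours):

```python
def true_class(x1, x2):
    # Points outside the disk of squared radius 0.15 around (0.5, 0.5)
    # are red (right half, x1 > 0.5) or green (left half, x1 <= 0.5);
    # the disk itself is blue.
    if (x1 - 0.5) ** 2 + (x2 - 0.5) ** 2 > 0.15:
        return "red" if x1 > 0.5 else "green"
    return "blue"
```

Uniform draws for (X₁, X₂) plus this rule generate the labeled data needed for parts (a) and (b).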
3. For each scenario, determine whether it is a regression or a classification problem, determine whether the goal is inference or prediction, and state the values of n (sample size) and p (number of predictors).

(a) I want to predict each student's final exam score based on his or her homework scores. There are 50 students enrolled in the course, and each student has completed 8 homeworks.

(b) I want to understand the factors that contribute to whether or not a student passes this course. The factors that I consider are (i) whether or not the student has previous programming experience; (ii) whether or not the student has previously studied linear algebra; (iii) whether or not the student has taken a previous stats/probability course; (iv) whether or not the student attends office hours; (v) the student's overall GPA; (vi) the student's year (e.g. freshman, sophomore, junior, senior, or grad student). I have data for all 50 students enrolled in the course.

4. In each setting, would you generally expect a flexible or an inflexible statistical machine learning method to perform better? Justify your answer.

(a) Sample size n is very small, and number of predictors p is very large.
(b) Sample size n is very large, and number of predictors p is very small.
(c) Relationship between predictors and response is highly non-linear.
(d) The variance of the error terms, i.e. σ² = Var(ε), is extremely high.

5. This question has to do with the bias-variance decomposition.

(a) Make a sketch of typical (squared) bias, variance, training error, test error, and Bayes (or irreducible) error curves, on a single plot, as we go from less flexible statistical learning methods to more flexible approaches. The x-axis should represent the amount of flexibility in the model, and the y-axis should represent the values of each curve. There should be five curves. Make sure to label each one.

(b) Explain why each of the five curves has the shape displayed in (a).
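For problem 5, it may help to recall the decomposition that most of the curves come from: at a test point x₀, the expected test MSE splits into the variance of the fit, its squared bias, and the irreducible error:

```latex
E\big[(y_0 - \hat f(x_0))^2\big]
  = \mathrm{Var}\big(\hat f(x_0)\big)
  + \big[\mathrm{Bias}\big(\hat f(x_0)\big)\big]^2
  + \mathrm{Var}(\varepsilon)
```

The training error curve is the one quantity not covered by this identity, which is why it can behave differently from the others as flexibility grows.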
6. This exercise involves the Boston housing data set, which is part of the MASS library in R.

(a) How many rows are in this data set? How many columns? What do the rows and columns represent?
(b) Make some pairwise scatterplots of the predictors (columns) in this data set. Describe your findings.
(c) Are any of the predictors associated with per capita crime rate? If so, explain the relationship.
(d) Do any of the suburbs of Boston appear to have particularly high crime rates? Tax rates? Pupil-teacher ratios? Comment on the range of each predictor.
(e) How many of the suburbs in this data set bound the Charles River?
(f) What are the mean and standard deviation of the pupil-teacher ratio among the towns in this data set?
(g) Which suburb of Boston has the highest median value of owner-occupied homes? What are the values of the other predictors for that suburb, and how do those values compare to the overall ranges for those predictors? Comment on your findings.
(h) In this data set, how many of the suburbs average more than six rooms per dwelling? More than eight rooms per dwelling? Comment on the suburbs that average more than eight rooms per dwelling.

Homework 2

1. Suppose we have a quantitative response Y and a single feature X ∈ ℝ. Let RSS₁ denote the residual sum of squares that results from fitting the model Y = β₀ + β₁X + ε using least squares. Let RSS₁₂ denote the residual sum of squares that results from fitting the model Y = β₀ + β₁X + β₂X² + ε using least squares.

(a) Prove that RSS₁₂ ≤ RSS₁.
(b) Prove that the R² of the model containing just the feature X is no greater than the R² of the model containing both X and X².

2. Describe the null hypotheses to which the p-values in Table 3.4 of the textbook correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model.
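The claim in problem 1(a) can be sanity-checked numerically before proving it: the quadratic model nests the linear one, so its RSS can never be larger. A NumPy sketch on simulated data (the data and helper name are ours, not part of the assignment):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)  # any data set will do

def rss(design):
    # Least squares fit of y on the given design matrix; return the RSS.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

rss1 = rss(np.column_stack([np.ones(50), x]))           # Y ~ 1 + X
rss12 = rss(np.column_stack([np.ones(50), x, x ** 2]))  # Y ~ 1 + X + X^2
```

The proof itself should argue this for every data set, not just a simulated one: any coefficient vector available to the smaller model is also available to the larger one (with β₂ = 0).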
3. Consider a linear model with just one feature, Y = β₀ + β₁X + ε. Suppose we have n observations from this model, (x₁, y₁), ..., (xₙ, yₙ). The least squares estimator is given in (3.4) of the textbook. Furthermore, we saw in class that if we construct an n × 2 matrix X̃ whose first column is a vector of 1's and whose second column is a vector with elements x₁, ..., xₙ, and if we let y denote the vector with elements y₁, ..., yₙ, then the least squares estimator takes the form

(β̂₀, β̂₁)ᵀ = (X̃ᵀX̃)⁻¹ X̃ᵀ y.   (1)

Prove that (1) agrees with equation (3.4) of the textbook, i.e. β̂₀ and β̂₁ in (1) equal β̂₀ and β̂₁ in (3.4).

4. This question involves the use of multiple linear regression on the Auto data set, which is available as part of the ISLR library.

(a) Use the lm() function to perform a multiple linear regression with mpg as the response and all other variables except name as the predictors. Use the summary() function to print the results. Comment on the output. For instance:
i. Is there a relationship between the predictors and the response?
ii. Which predictors appear to have a statistically significant relationship to the response?
iii. Provide an interpretation for the coefficient associated with the variable year. Make sure that you treat the qualitative variable origin appropriately.

(b) Try out some models to predict mpg using functions of the variable horsepower. Comment on the best model you obtain. Make a plot with horsepower on the x-axis and mpg on the y-axis that displays both the observations and the fitted function (i.e. f̂(horsepower)).

(c) Now fit a model to predict mpg using horsepower, origin, and an interaction between horsepower and origin. Make sure to treat the qualitative variable origin appropriately. Comment on your results. Provide a careful interpretation of each regression coefficient.
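The identity in problem 3 can be checked numerically before proving it: the matrix formula (1) and the scalar formulas in (3.4) of the textbook (slope from centered cross-products, intercept from the sample means) give the same coefficients. A NumPy sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = 0.5 + 1.7 * x + rng.normal(size=30)

# Matrix form (1): (b0, b1)' = (X'X)^{-1} X'y with X = [1, x].
X = np.column_stack([np.ones(30), x])
b0_mat, b1_mat = np.linalg.solve(X.T @ X, X.T @ y)

# Scalar form (3.4): slope = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2),
# intercept = ybar - slope * xbar.
xc, yc = x - x.mean(), y - y.mean()
b1 = (xc @ yc) / (xc @ xc)
b0 = y.mean() - b1 * x.mean()
```

The proof amounts to inverting the 2 × 2 matrix X̃ᵀX̃ by hand and simplifying.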
5. Consider fitting a model to predict credit card balance using income and student, where student is a qualitative variable that takes on one of three values: student ∈ {graduate, undergraduate, not student}.

(a) Encode the student variable using two dummy variables, one of which equals 1 if student = graduate (and 0 otherwise), and one of which equals 1 if student = undergraduate (and 0 otherwise). Write out an expression for a linear model to predict balance using income and student, using this coding of the dummy variables. Interpret the coefficients in this linear model.

(b) Now encode the student variable using two dummy variables, one of which equals 1 if student = not student (and 0 otherwise), and one of which equals 1 if student = graduate (and 0 otherwise). Write out an expression for a linear model to predict balance using income and student, using this coding of the dummy variables. Interpret the coefficients in this linear model.

(c) Using the coding in (a), write out an expression for a linear model to predict balance using income, student, and an interaction between income and student. Interpret the coefficients in this model.

(d) Using the coding in (b), write out an expression for a linear model to predict balance using income, student, and an interaction between income and student. Interpret the coefficients in this model.

(e) Using simulated data for balance, income, and student, show that the fitted values (predictions) from the models in (a)-(d) do not depend on the coding of the dummy variables (i.e. the models in (a) and (b) yield the same fitted values, as do the models in (c) and (d)).

6. Extra credit. Consider a linear model with just one feature, Y = β₀ + β₁X + ε, with E(ε) = 0 and Var(ε) = σ². Suppose we have n observations from this model, (x₁, y₁), ..., (xₙ, yₙ). We assume that x₁, ..., xₙ are fixed, so the only randomness in the model comes from ε₁, ..., εₙ.
Use (3.4) in the textbook — or, if you prefer, the matrix algebra formulation in (1) of this homework assignment — in order to derive the expressions for Var(β̂₀) and Var(β̂₁) given in (3.8) of the textbook.

Homework 3

1. A random variable X has an Exponential(λ) distribution if its probability density function is of the form

f(x) = λe^(−λx) if x > 0, and f(x) = 0 if x ≤ 0,

where λ > 0 is a parameter. Furthermore, the mean of an Exponential(λ) random variable is 1/λ. Now, consider a classification problem with K = 2 classes and a single feature X ∈ ℝ. If an observation is in class 1 (i.e. Y = 1), then X ∼ Exponential(λ₁); and if an observation is in class 2 (i.e. Y = 2), then X ∼ Exponential(λ₂). Let π₁ denote the probability that an observation is in class 1, and let π₂ = 1 − π₁.

(a) Derive an expression for Pr(Y = 1 | X = x). Your answer should be in terms of x, λ₁, λ₂, π₁, π₂.

(b) Write a simple expression for the Bayes classifier decision boundary, i.e. an expression for the set of x such that Pr(Y = 1 | X = x) = Pr(Y = 2 | X = x).

(c) For part (c) only, suppose λ₁ = 2, λ₂ = 7, π₁ = 0.5. Make a plot of feature space. Clearly label:
i. the region of feature space corresponding to the Bayes classifier decision boundary,
ii. the region of feature space for which the Bayes classifier will assign an observation to class 1,
iii. the region of feature space for which the Bayes classifier will assign an observation to class 2.

(d) Now suppose that we observe n independent training observations, (x₁, y₁), ..., (xₙ, yₙ). Provide simple estimators for λ₁, λ₂, π₁, π₂ in terms of the training observations.

(e) Given a test observation X = x₀, provide an estimate of Pr(Y = 1 | X = x₀). Your answer should be written only in terms of the n training observations (x₁, y₁), ..., (xₙ, yₙ) and the test observation x₀, and not in terms of any unknown parameters.
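Parts (a)-(c) come down to Bayes' rule with Exponential class-conditional densities. A small Python sketch of that computation with the part (c) parameters, useful for checking a derivation numerically (the function name is ours):

```python
import math

def posterior_class1(x, lam1=2.0, lam2=7.0, pi1=0.5):
    # Bayes' rule for x > 0: Pr(Y = 1 | X = x) is proportional to
    # pi1 * lam1 * exp(-lam1 * x), and class 2 analogously.
    p1 = pi1 * lam1 * math.exp(-lam1 * x)
    p2 = (1 - pi1) * lam2 * math.exp(-lam2 * x)
    return p1 / (p1 + p2)

# The posteriors cross where the weighted densities are equal; with these
# parameters that happens at x = log(3.5) / 5.
boundary = math.log((0.5 * 7.0) / (0.5 * 2.0)) / (7.0 - 2.0)
```

Because λ₁ < λ₂, the class-2 density is larger near zero and the class-1 density dominates for large x, so the posterior for class 1 increases in x.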
2. We collect some data for students in a statistics class, with predictors X₁ = number of lectures attended, X₂ = average number of hours studied per week, and response Y = receive an A. We fit a logistic regression model, and get coefficient estimates β̂₀, β̂₁, β̂₂.

(a) Write out an expression for the probability that a student gets an A, as a function of the number of lectures she attended and the average number of hours she studied per week. Your answer should be written in terms of X₁, X₂, β̂₀, β̂₁, β̂₂.

(b) Write out an expression for the minimum number of hours a student should study per week in order to have at least an 80% chance of getting an A. Your answer should be written in terms of X₁, X₂, β̂₀, β̂₁, β̂₂.

(c) Based on a student's values of X₁ and X₂, her predicted probability of getting an A in this course is 60%. If she increases her studying by one hour per week, then what will be her predicted probability of getting an A in this course?

3. When the number of features p is large, there tends to be a deterioration in the performance of K-nearest neighbors (KNN) and other approaches that perform prediction using only observations that are near the test observation for which a prediction must be made. This phenomenon is known as the curse of dimensionality. We will now investigate this curse.

(a) Suppose that we have a set of observations, each with measurements on p = 1 feature, X. We assume that X is uniformly distributed on [0, 1]. Associated with each observation is a response value. Suppose that we wish to predict a test observation's response using only observations that are within 10% of the range of X closest to that test observation. For instance, in order to predict the response for a test observation with X = 0.6, we will use observations in the range [0.55, 0.65]. On average, what fraction of the available observations will we use to make the prediction?
(b) Now suppose that we have a set of observations, each with measurements on p = 2 features, X₁ and X₂. We assume that (X₁, X₂) are uniformly distributed on [0, 1] × [0, 1]. We wish to predict a test observation's response using only observations that are within 10% of the range of X₁ and within 10% of the range of X₂ closest to that test observation. For instance, in order to predict the response for a test observation with X₁ = 0.6 and X₂ = 0.35, we will use observations in the range [0.55, 0.65] for X₁ and in the range [0.3, 0.4] for X₂. On average, what fraction of the available observations will we use to make the prediction?

(c) Now suppose that we have a set of observations on p = 100 features. Again the observations are uniformly distributed on each feature, and again each feature ranges in value from 0 to 1. We wish to predict a test observation's response using observations within the 10% of each feature's range that is closest to that test observation. What fraction of the available observations will we use to make the prediction?

(d) Using your answers to parts (a)-(c), argue that a drawback of KNN when p is large is that there are very few training observations "near" any given test observation.

(e) Now suppose that we wish to make a prediction for a test observation by creating a p-dimensional hypercube centered around the test observation that contains, on average, 10% of the training observations. For p = 1, 2, and 100, what is the length of each side of the hypercube? Comment on your answer. Note: a hypercube is a generalization of a cube to an arbitrary number of dimensions. When p = 1, a hypercube is simply a line segment; when p = 2, it is a square.

4. Pick a data set of your choice. It can be chosen from the ISLR package (but not one of the data sets explored in the Chapter 4 lab, please!), or it can be another data set that you choose. Choose a binary qualitative variable in your data set to be the response, Y.
(By binary qualitative variable, I mean a qualitative variable with K = 2 classes.) If your data set doesn't have any binary qualitative variables, then you can create one (e.g. by dichotomizing a continuous variable: create a new variable that equals 1 or 0 depending on whether the continuous variable takes on values above or below its median). I suggest selecting a data set with n ≫ p.

(a) Describe the data. What are the values of n and p? What are you trying to predict, i.e. what is the meaning of Y? What is the meaning of the features?

(b) Split the data into a training set and a test set. Perform LDA on the training set in order to predict Y using the features. What is the training error of the model obtained? What is the test error?

(c) Perform QDA on the training set in order to predict Y using the features. What is the training error of the model obtained? What is the test error?

(d) Perform logistic regression on the training set in order to predict Y using the features. What is the training error of the model obtained? What is the test error?

(e) Perform KNN on the training set in order to predict Y using the features. What is the training error of the model obtained? What is the test error?

(f) Comment on your results.

Homework 4

1. Consider the validation set approach, with a 50/50 split into training and validation sets:

(a) Suppose you perform the validation set approach twice, each time with a different random seed. What's the probability that an observation, chosen at random, is in both of those training sets?

(b) If you perform the validation set approach repeatedly, will you get the same result each time? Explain your answer.

2. Consider K-fold cross-validation:

(a) Consider the observations in the 1st fold's training set and the observations in the 2nd fold's training set. What's the probability that an observation, chosen at random, is in both of those training sets?

(b) If you perform K-fold CV repeatedly, will you get the same result each time?
Explain your answer.

3. Now consider leave-one-out cross-validation:

(a) Consider the observations in the 1st fold's training set and the observations in the 2nd fold's training set. What's the probability that an observation, chosen at random, is in both of those training sets?

(b) If you perform leave-one-out cross-validation repeatedly, will you get the same result each time? Explain your answer.

4. Consider a very simple model, Y = β + ε, where Y is a scalar response variable, β ∈ ℝ is an unknown parameter, and ε is a noise term with E(ε) = 0 and Var(ε) = σ². Our goal is to estimate β. Assume that we have n observations with uncorrelated errors.

(a) Suppose that we perform least squares regression using all n observations. Prove that the least squares estimator, β̂, equals (1/n) Σᵢ₌₁ⁿ Yᵢ.

(b) Suppose that we perform least squares using all n observations. Prove that the least squares estimator, β̂, has variance σ²/n.

(c) Consider the least squares estimator of β fit using just n/2 observations. What is the variance of this estimator?

(d) Consider the least squares estimator of β fit using n(K − 1)/K observations, for some K > 2. What is the variance of this estimator?

(e) Consider the least squares estimator of β fit using n − 1 observations. What is the variance of this estimator?

(f) Derive an expression for E(β̂), where β̂ is the least squares estimator fit using all n observations.

(g) Using your results from the earlier sections of this question, argue that the validation set approach tends to over-estimate the expected test error.

(h) Using your results from the earlier sections of this question, argue that leave-one-out cross-validation does not substantially over-estimate the expected test error, provided that n is large.
(i) Using your results from the earlier sections of this question, argue that K-fold CV provides an over-estimate of the expected test error that is somewhere between the big over-estimate resulting from the validation set approach and the very mild over-estimate resulting from leave-one-out CV.

5. As in the previous problem, assume Y = β + ε, where Y is a scalar response variable, β ∈ ℝ is an unknown parameter, and ε is a noise term with E(ε) = 0 and Var(ε) = σ². Our goal is to estimate β. Assume that we have n observations with uncorrelated errors.

(a) Suppose that we perform K-fold cross-validation. What is the correlation between β̂⁽¹⁾, the least squares estimator of β that we obtain from the 1st fold, and β̂⁽²⁾, the least squares estimator of β that we obtain from the 2nd fold?

(b) Suppose that we perform the validation set approach twice, each time using a different random seed. Assume further that exactly 0.25n observations overlap between the two training sets. What is the correlation between β̂⁽¹⁾, the least squares estimator of β that we obtain the first time that we perform the validation set approach, and β̂⁽²⁾, the least squares estimator of β that we obtain the second time that we perform the validation set approach?

(c) Now suppose that we perform leave-one-out cross-validation. What is the correlation between β̂⁽¹⁾, the least squares estimator of β that we obtain from the 1st fold, and β̂⁽²⁾, the least squares estimator of β that we obtain from the 2nd fold?

Remark 1: Problem 5 indicates that the β̂'s that you estimate using LOOCV are very correlated with each other.

Remark 2: You might remember from an earlier stats class that if X₁, ..., Xₙ are uncorrelated with variance σ² and mean μ, then the variance of (1/n) Σᵢ₌₁ⁿ Xᵢ equals σ²/n. But if Cor(Xᵢ, Xₖ) > 0 for i ≠ k, then the variance of (1/n) Σᵢ₌₁ⁿ Xᵢ is quite a bit higher.
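Remark 2 is easy to verify by simulation: build equicorrelated Xᵢ from a shared component and compare the variance of their mean with the uncorrelated benchmark σ²/n. A NumPy sketch (the construction and the constants n, reps, rho are ours):

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps, rho = 20, 20000, 0.5  # unit variance: sigma^2 = 1

# X_i = sqrt(rho) * Z + sqrt(1 - rho) * W_i, with Z and W_i independent
# standard normals, gives Var(X_i) = 1 and Cor(X_i, X_k) = rho for i != k.
shared = np.sqrt(rho) * rng.normal(size=(reps, 1))
indiv = np.sqrt(1 - rho) * rng.normal(size=(reps, n))
means = (shared + indiv).mean(axis=1)

var_uncorrelated = 1 / n              # sigma^2 / n = 0.05
var_correlated = rho + (1 - rho) / n  # = 0.525 here; shrinks to rho, not 0
empirical = means.var()
```

Note that as n grows, the variance of the mean approaches ρ rather than 0, which is the mechanism behind the high variance of the LOOCV estimate in Remark 3.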
Remark 3: Together, problems 4 and 5 might give you some intuition for the following: LOOCV results in an approximately unbiased estimator of expected test error (if n is large), but this estimator has high variance. In contrast, K-fold CV results in an estimator of expected test error that has higher bias, but lower variance.

Homework 5

1. In this exercise, you will generate simulated data, and will use this data to perform best subset selection.

(a) Use the rnorm() function to generate a predictor X of length n = 100, and a noise vector ε of length n = 100.

(b) Generate a response vector Y of length n = 100 according to the model Y = 3 − 2X + X² + ε.

(c) Use the regsubsets() function to perform best subset selection, considering X, X², ..., X⁷ as candidate predictors. Make a plot like Figure 6.2 in the textbook. What is the overall best model according to Cp, BIC, and adjusted R²? Report the coefficients of the best model obtained. Comment on your results.

(d) Repeat (c) using forward stepwise selection instead of best subset selection.

(e) Repeat (c) using backward stepwise selection instead of best subset selection.

Hint: You may need to use the data.frame() function to create a single data set containing both X and Y.

2. In class, we discussed the fact that if you choose a model using stepwise selection on a data set, and then fit the selected model using least squares on the same data set, then the resulting p-values output by R are highly misleading. We'll now see this through simulation.

(a) Use the rnorm() function to generate vectors X₁, X₂, ..., X₁₀₀ and ε, each of length n = 1000. (Hint: use the matrix() function to create a 1000 × 100 data matrix.)

(b) Generate data according to Y = β₀ + β₁X₁ + ... + β₁₀₀X₁₀₀ + ε, where β₁ = ... = β₁₀₀ = 0.

(c) Fit a least squares regression model to predict Y using X₁, ..., Xₚ. Make a histogram of the p-values associated with the null hypotheses H₀ⱼ: βⱼ = 0 for j = 1, ..., 100.
Hint: You can easily access these p-values using the command (summary(lm(y~X)))$coef[,4].

(d) Recall that under H₀ⱼ: βⱼ = 0, we expect the p-values to have a Unif[0, 1] distribution. In light of this fact, comment on your results in (c). Do any of the features appear to be significantly associated with the response?

(e) Perform forward stepwise selection in order to identify M₂, the best two-variable model. (For this problem, there is no need to calculate the best model Mₖ for k ≠ 2.) Then fit a least squares regression model to the data, using just the features in M₂. Comment on the p-values obtained for the coefficients.

(f) Now generate another 1000 observations by repeating the procedure in (a) and (b). Using the new observations, fit a least squares linear model to predict Y using just the features in M₂ calculated in (e). (Do not perform forward stepwise selection again using the new observations! Instead, take the M₂ obtained earlier in this problem.) Comment on the p-values for the coefficients. How do they compare to the p-values in (e)?

(g) Are the features in M₂ significantly associated with the response? Justify your answer.

THE BOTTOM LINE: If you showed a friend the p-values obtained in (e), without explaining that you obtained M₂ by performing forward stepwise selection on this same data, then he or she might incorrectly conclude that the features in M₂ are highly associated with the response.

3. Let's consider doing least squares and ridge regression under a very simple setting, in which p = 1 and Σᵢ₌₁ⁿ yᵢ = Σᵢ₌₁ⁿ xᵢ = 0. We consider regression without an intercept. (It's usually a bad idea to do regression without an intercept, but if our feature and response each have mean zero, then it is okay to do this!)

(a) The least squares solution is the value of β ∈ ℝ that minimizes Σᵢ₌₁ⁿ (yᵢ − βxᵢ)². Write out an analytical (closed-form) expression for this least squares solution. Your answer should be a function of x₁, ..., xₙ and y₁, ...
, yₙ. Hint: calculus!

(b) For a given value of λ, the ridge regression solution minimizes Σᵢ₌₁ⁿ (yᵢ − βxᵢ)² + λβ². Write out an analytical (closed-form) expression for the ridge regression solution, in terms of x₁, ..., xₙ and y₁, ..., yₙ and λ.

(c) Suppose that the true data-generating model is Y = 3X + ε, where ε has mean zero, and X is fixed (non-random). What is the expectation of the least squares estimator from (a)? Is it biased or unbiased?

(d) Suppose again that the true data-generating model is Y = 3X + ε, where ε has mean zero, and X is fixed (non-random). What is the expectation of the ridge regression estimator from (b)? Is it biased or unbiased? Explain how the bias changes as a function of λ.

(e) Suppose that the true data-generating model is Y = 3X + ε, where ε has mean zero and variance σ², X is fixed (non-random), and Cov(εᵢ, εᵢ′) = 0 for all i ≠ i′. What is the variance of the least squares estimator from (a)?

(f) Suppose that the true data-generating model is Y = 3X + ε, where ε has mean zero and variance σ², X is fixed (non-random), and Cov(εᵢ, εᵢ′) = 0 for all i ≠ i′. What is the variance of the ridge estimator from (b)? How does the variance change as a function of λ?

(g) In light of your answers to parts (d) and (f), argue that λ in ridge regression allows us to control model complexity by trading off bias for variance.

Hint: For this problem, you might want to brush up on some basic properties of means and variances! For instance, if Cov(Z, W) = 0, then Var(Z + W) = Var(Z) + Var(W). And if a is a constant, then Var(aW) = a²Var(W), and Var(a + W) = Var(W).

4. Suppose that you collect data to predict Y (height in inches) using X (weight in pounds). You fit a least squares model to the data, and you get Ŷ = 3.1 + 0.57X.

(a) Suppose you decide that you want to measure weight in ounces instead of pounds. Write out the least squares model for predicting Y using X̃ (weight in ounces).
(You should calculate the coefficient estimates explicitly.) Hint: there are 16 ounces in a pound!

(b) Consider fitting a least squares model to predict Y using X and X̃. Let β denote the coefficient for X in the least squares model, and let β̃ denote the coefficient for X̃. Argue that any equation of the form Ŷ = 3.1 + βX + β̃X̃, where β + 16β̃ = 0.57, is a valid least squares model.

(c) Suppose that you use ridge regression to predict Y using X, using some value of λ, and obtain the fitted model Ŷ = 3.1 + 0.4X. Now consider fitting a ridge regression model to predict Y using X̃, again using that same value of λ. Will the coefficient of X̃ be equal to 0.4/16, greater than 0.4/16, or less than 0.4/16? Explain your answer.

(d) For the same value of λ considered in (c), suppose you perform ridge regression to predict Y using X, and separately you perform ridge regression to predict Y using X̃. Which fitted model will have the smaller residual sum of squares (on the training set)? Explain your answer.

(e) Finally, suppose you use ridge regression to predict Y using X and X̃, using some value of λ (not necessarily the same value of λ used in (d)), and obtain the fitted model Ŷ = 3.17 + 0.03X + 0.03X̃. Is the following claim true or false? Explain your answer. Claim: any equation of the form Ŷ = 3.17 + βX + β̃X̃, where β + 16β̃ = 0.03 + 16 × 0.03 = 0.51, is a valid ridge regression solution for that value of λ.

(f) Argue that your answers to the previous sub-problems support the following claim: least squares is scale-invariant, but ridge regression is not.

5. Suppose we wish to fit a linear regression model using least squares. Let M_k^BSS, M_k^FWD, and M_k^BWD denote the best k-feature models in the best subset, forward stepwise, and backward stepwise selection procedures, respectively. (For notational details, see Algorithms 6.1, 6.2, and 6.3 of the textbook.) Recall that the training set residual sum of squares (or RSS for short) is defined as Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)².
For each claim, fill in the blank with one of the following: "less than", "less than or equal to", "greater than", "greater than or equal to", "equal to". Say "not enough information to tell" if it is not possible to complete the sentence as given. Explain each of your answers.

(a) Claim: The RSS of M_1^FWD is ___ the RSS of M_1^BWD.
(b) Claim: The RSS of M_0^FWD is ___ the RSS of M_0^BWD.
(c) Claim: The RSS of M_1^FWD is ___ the RSS of M_1^BSS.
(d) Claim: The RSS of M_2^FWD is ___ the RSS of M_1^BSS.
(e) Claim: The RSS of M_1^BWD is ___ the RSS of M_1^BSS.
(f) Claim: The RSS of M_p^BWD is ___ the RSS of M_p^BSS.
(g) Claim: The RSS of M_{p−1}^BWD is ___ the RSS of M_{p−1}^BSS.
(h) Claim: The RSS of M_4^BWD is ___ the RSS of M_4^BSS.
(i) Claim: The RSS of M_4^BWD is ___ the RSS of M_4^FWD.
(j) Claim: The RSS of M_4^BWD is ___ the RSS of M_3^BWD.

6. This problem is extra credit! Let y denote an n-vector of response values, and let X denote an n × p design matrix. We can write the ridge regression problem as

minimize over β ∈ ℝᵖ: ‖y − Xβ‖² + λ‖β‖²,

where we are omitting the intercept for convenience. Derive an analytical (closed-form) expression for the ridge regression estimator. Your answer should be a function of X, y, and λ.

Homework 6

1. For this problem, you will analyze a data set of your choice, not taken from the ISLR package. I suggest choosing a data set that has p ≈ n or even p > n, since you will apply methods from Chapter 6 to this data.

(a) Describe the data in words. Where did you get it from, and what is the data about? You will perform supervised learning on this data, so you must identify a response, Y, and features, X₁, ..., Xₚ. What are the values of n and p? Describe the response and the features (e.g. what are they measuring; are they quantitative or qualitative?). Plot some summary statistics of the data.

(b) Split the data into a training set and a test set. What are the values of n and p on the training set?
(c) Fit a linear model using least squares on the training set, and report the test error obtained.

(d) Fit a ridge regression model on the training set, with λ chosen by cross-validation. Report the test error obtained.

(e) Fit a lasso model on the training set, with λ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

(f) Fit a principal components regression model on the training set, with M chosen by cross-validation. Report the test error obtained, along with the value of M selected by cross-validation.

(g) Fit a partial least squares model on the training set, with M chosen by cross-validation. Report the test error obtained, along with the value of M selected by cross-validation.

(h) Comment on the results obtained. How accurate is the best model you obtained, in terms of test error? Is there much difference among the test errors resulting from these approaches? Which model do you prefer?

2. Define the basis functions b_1(X) = I(−1 < X ≤ 1) − (2X − 1)I(1 < X ≤ 3) and b_2(X) = (X + 1)I(3 < X ≤ 5) − I(5 < X ≤ 6). We fit the linear regression model Y = β_0 + β_1 b_1(X) + β_2 b_2(X) + ε, and obtain coefficient estimates β̂_0 = 2, β̂_1 = −1, β̂_2 = 2. Sketch the estimated curve between X = −3 and X = 8. Note the intercepts, slopes, and other relevant information.

1. For this problem, you will analyze a data set of your choice, not taken from the ISLR package. Choose a data set that has n ≫ p, since you will apply methods from Chapter 7 to this data. You will also need to have p > 1. Throughout this problem, make sure to label your axes appropriately, and to include legends when needed.

(a) Describe the data in words. Where did you get it from, and what is the data about? You will perform supervised learning on this data, so you must identify a response, Y, and features, X_1, ..., X_p. What are the values of n and p? Describe the response and the features (e.g.
what are they measuring; are they quantitative or qualitative?).

(b) Fit a generalized additive model, Y = f_1(X_1) + ... + f_p(X_p) + ε. Use cross-validation to choose the level of complexity. For j = 1, ..., p, make a scatterplot of X_j against Y, and plot f̂_j(X_j). Comment on your results and on the choices you made in fitting this model.

(c) Now fit a linear model, Y = β_0 + β_1 X_1 + ... + β_p X_p + ε. For j = 1, ..., p, display the linear fit (X_j β̂_j) on top of a scatterplot of X_j against Y.

(d) Estimate the test error of the generalized additive model and the test error of the linear model. Comment on your results. Which approach gives a better fit to the data?

2. In this problem, we'll play around with regression splines.

(a) Generate data as follows:

set.seed(7)
x


[SOLVED] Comp 416 projects #1 to 3 solution

Introduction and Motivation: This project is about the application layer of the network protocol stack. It involves application layer software development, client/server protocols, application layer protocol principles, socket programming, and multithreading. Through this project, you are going to develop a weather reporting network application (WeatNet) by interacting with the application programming interface (API) of OpenWeatherMap (openweathermap.org). The project will require you to work with the following APIs for information extraction from the OWM web server:

1. Current weather forecast
2. Daily forecast for 7 days
3. Basic weather maps
4. Minute forecast for 1 hour
5. Historical weather for 5 days

These APIs provide you access to the weather data, which will subsequently be accessed by the clients.

Project Overview: In this project, you are asked to develop a weather reporting network application based on the client/server model. The WeatNet server provides two types of TCP connections to interact with the clients: one connection for exchanging the protocol commands, and one for data transfers. Fig. 1 shows the connections for a sample WeatNet server and client interaction. As shown in this figure, the WeatNet server also takes the responsibility of interacting with an OpenWeatherMap web server using the OpenWeatherMap API.

Figure 1. The OWM Client/Server connections.

Implementation Details: The WeatNet reporting application has three main components:
● Interaction of OpenWeatherMap (OWM) and the server
● Server side of the application
● Client side of the application

It is pertinent to note that the free subscription for OWM allows up to 60 API calls per minute and a total of 1 million calls per month. The developed application and its testing, including the demonstration, must keep these limits in mind.

Phases: This project has two phases: an authentication phase and a querying phase.
During the authentication phase, the client provides its username and answers a series of secret questions to prove its identity. The protocol (the message flow, message types, and their formats) is provided in the "Authentication" section. The groups need only implement the provided protocol. During the querying phase, the authenticated clients communicate with the server to retrieve weather information conforming to the specifications explained in the following sections. Unlike the authentication phase, we do not provide a protocol here but expect you to design your own. This will be a part of your report. You may use the authentication protocol as a starting point and build from there.

City List: The WeatNet reporting will be done for the following cities:

1. {'id': 745044, 'name': 'Istanbul', 'state': '', 'country': 'TR', 'coord': {'lon': 28.949659, 'lat': 41.01384}}
2. {'id': 740264, 'name': 'Samsun', 'state': '', 'country': 'TR', 'coord': {'lon': 36.330002, 'lat': 41.286671}}
3. {'id': 315201, 'name': 'Eskişehir', 'state': '', 'country': 'TR', 'coord': {'lon': 31.16667, 'lat': 39.666672}}
4. {'id': 323784, 'name': 'Ankara', 'state': '', 'country': 'TR', 'coord': {'lon': 32.833328, 'lat': 39.916672}}
5. {'id': 304919, 'name': 'Malatya', 'state': '', 'country': 'TR', 'coord': {'lon': 38.0, 'lat': 38.5}}
6. {'id': 750268, 'name': 'Bursa', 'state': '', 'country': 'TR', 'coord': {'lon': 29.08333, 'lat': 40.166672}}
7. {'id': 311044, 'name': 'İzmir', 'state': '', 'country': 'TR', 'coord': {'lon': 27.092291, 'lat': 38.462189}}

You can find the entire list of cities supported by OWM here. Please note that OWM may use multiple coordinates within the same city. The developed application must ensure that, even if names are duplicated, the coordinates are those of the cities provided in the list. Here, the city IDs will be useful when using the OWM API.
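To make the API interaction concrete, here is a minimal sketch of building a current-weather request URL from one of the city IDs above. The endpoint path follows OWM's public REST API; the API key is a placeholder you would obtain from your own OWM account, and the class and method names are ours, not part of any handout.

```java
public class OwmUrl {
    // Builds a current-weather request URL for a given OWM city ID.
    // The "appid" query parameter carries your personal API key.
    public static String currentWeather(int cityId, String apiKey) {
        return "https://api.openweathermap.org/data/2.5/weather"
                + "?id=" + cityId + "&appid=" + apiKey;
    }

    public static void main(String[] args) {
        // 745044 is Istanbul's city ID from the list above; the key is a placeholder.
        System.out.println(currentWeather(745044, "YOUR_API_KEY"));
    }
}
```

Fetching this URL (e.g. with HttpURLConnection) returns a JSON document that the server can save to a file before forwarding it to clients; the other metric endpoints follow the same key-and-ID pattern.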
WeatNet Server Side Overview: The server for the WeatNet application should:
1) Establish a connection with the OpenWeatherMap (OWM) web server using the API and download the specified weather metrics for all the cities specified for this project.
2) Authenticate any client before initiating any data exchange.
3) Allow multiple clients to connect, with the same functionality, using multithreading.
4) Add a timestamp to each file before sending it to any client, while providing hashes of the files for error-detection purposes.

The metric categories for which the OWM API will be used are:
1. Current weather forecast
2. Weather triggers
3. Basic weather maps
4. Minute forecast for 1 hour
5. Historical weather for 5 days

All these metrics except the weather maps are to be downloaded in JSON format as separate files, whereas the basic maps will be downloaded as images (.jpg, .png, etc.).

API Interactions: For the interactions to take place, the client will pass the name/ID of the city and/or the weather triggers. All this information shall be conveyed between the client and the server on the Command socket (the socket allocated to the client after acceptance and validation). Based on this input, the server will parse the client input and use the API to extract the relevant information from the web server. This information will then be passed to the client in the form of a JSON/image file over the Data socket. The server will be responsible for generating a hash value of each file before transmitting it to the client, and this hash will also be sent to the client over the Command socket for file verification.

The process at the server side is:
1) Create a welcoming socket.
2) Accept incoming client connections at the welcoming socket, simultaneously if needed using multithreading.
3) Authenticate any incoming client connection request after acceptance. Create an additional Data socket with the client if authenticated. Terminate the connection if authentication fails.
(The demonstration should present both scenarios, in which clients are validated as well as rejected.)
4) Decipher the requests coming in from the clients and download the required files from the web server. (The protocol for forwarding requests has to be developed by the groups themselves and explained in their reports. A similar format is provided in the Authentication section.)
5) Generate the hash values of the files requested by the client.
6) Send the hash value of the required file over the Command socket.
7) Send the requisite file over the Data socket.
8) Terminate the connection if a "file received" acknowledgment message is received from the client and no other files are requested within the timeout duration.

The exact implementation of the Data socket is left to the choice of the groups. It can be a single socket shared by the clients through multithreading, or a dedicated socket for each client. The initiation of the Data socket and the exact nature of how its parameters are passed to the client are also to be decided by the groups. The parameters of the Data socket are to be conveyed over the Command socket if required.

WeatNet Client Side Overview: This weather application envisions multiple clients interacting with a single server. For this application, the clients should:
1) Confirm their authenticity with the server.
2) Be able to submit requests to the server.
3) Be able to receive data in the form of JSON/image files from the server over the Data socket.
4) Verify each file based on its hash value.
5) Display the JSON data in tabular form, or display the image, on the client side.

Client-Server Interaction: The client side must take care of the parameters to be passed when requesting any metric. The process at the client side will follow the given steps:
1) Initiate a connection with the server over the Command socket.
2) Authenticate based on the server requirements.
3) Receive the parameters for the Data socket and connect to it after authentication.
4) Pass the requests to the server.
5) Receive the hash value of the file on the Command socket.
6) Receive the files over the Data socket.
7) Confirm that the hash value corresponds to the file.
8) Request a retransmit from the server if there is a mismatch between the hash value and the file, or a failure to receive the file. (A relevant string should be displayed in the terminal; a scenario for this step may be specifically designed for demonstration purposes.)
9) Display the files in the appropriate manner.
10) Terminate the connection once an appropriate file is received and no other request is forwarded within the timeout duration. (Appropriate tests for demonstration purposes should be developed.)

Authentication Phase: In our implementation, the server does not share the weather information with everyone. The client needs to be authenticated in order to be able to query the server, which will be done by a series of "challenges" (i.e., secret questions) instead of simple password-based authentication. After the authentication is done, the client will receive a "token" from the server. The token will act as a "proof" of the authenticity of the client. From that point on, the client will need to append this token to its requests, and once the server receives a request, it responds only if the token is valid. Please note that the authenticated clients are authorized to perform all possible weather queries, so we do not distinguish between authentication and authorization for the sake of simplicity.

Fig. 2. The message flows for the authentication phase.

First, the client will send its username to request to be authenticated (i.e., to acquire a token). Then, the server will authenticate the client by requesting the answers to a series of secret questions that are included in Auth_Challenge messages. After receiving a question, the client will prompt the user, the user will enter the answer through standard input, and the client will send the answer by including it in an Auth_Request message.
Some example questions and answers:
– "What is your favorite color?" – "red"
– "What is the first name of your favorite author?" – "kadri"
– "In which city were you born?" – "istanbul"
– "What is the last name of your best friend?" – "zorlu"
– "What is your goal in this course?" – "to get an A"

The server decides how many questions the client should answer, and the questions will be chosen from a pool of possible questions. The correct answers to these questions for the particular user will be known by the server. If all the questions are answered correctly by the client, the server will send back an Auth_Success, including a unique token for the client. If the client answers a question incorrectly, the server will immediately respond with an Auth_Fail with the failure reason "Incorrect answer". Please note that the server may also send back an Auth_Fail when the first Auth_Request includes a nonexistent username. In this case, the reason for failure should be "User does not exist". Fig. 2 illustrates the intended message flow, where q_i denotes the i-th question and a_i denotes the client's answer to q_i.

a. Message format
In protocol design, (1) the message types, and (2) the format and meaning of the values in each type of message must be clearly specified. The client and the server will handle the received messages according to their types. Similarly, they will construct the messages adhering to the protocol.

Message types: During the authentication phase, the client is able to send only Auth_Request messages, and the server is able to send Auth_Challenge, Auth_Fail and Auth_Success messages.

Message type     Value   Payload
Auth_Request     0       Username/Answer (String)
Auth_Challenge   1       Question (String)
Auth_Fail        2       Reason of failure (String)
Auth_Success     3       Token (String)

Deconstructing the TCP data: The TCP data received from the socket must be deconstructed correctly.
The first six bytes are designated as the application header, where the first byte represents the "phase" (either the authentication (0) or querying (1) phase) and the second byte represents the "type" of the message. If the "phase" byte is set to 0, the messages will be handled by the authentication module of your implementation. Otherwise, your implementation should hand over the handling of the request to the weather querying module. The remaining four bytes of the header are designated as an integer (4 bytes) denoting the length of the payload in bytes. Your application should use this value to read the correct number of bytes from the TCP stream as the payload, since we do not know the length of the payload beforehand. Fig. 3 shows the deconstruction of an authentication message. Here are some useful links:
https://docs.oracle.com/javase/7/docs/api/java/io/DataInputStream.html
https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html
https://docs.oracle.com/javase/7/docs/api/java/io/FilterOutputStream.html

Fig. 3. Deconstructing the TCP data.

b. Timeout mechanism
You also need to handle the case where the client is unresponsive to a question. After sending a challenge, the server should wait only a predetermined amount of time (e.g. 10 seconds) before sending an Auth_Fail to the client with the appropriate reason message and closing the connection.

c. Implementation details
Storing users, questions, and answers: For simplicity, you may want to keep all the users, questions, and answers in an easily parsable text file. Fig. 4 illustrates an example of such a file, where we have two users – "ali" and "veli" – with different questions and answers.

Fig. 4. An example of a text file storing users, questions and answers.

The token: The token should be unique for each session. Ideally, tokens should be constructed on the fly. You can construct a token by performing a hash on the concatenation of the username and a random number.
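The six-byte header framing and the token construction described in this section (hash the username plus a random number, then truncate) can be sketched with the DataInputStream/DataOutputStream classes linked above. This is a minimal illustration, not the handout's code: the class and method names are our own, and SHA-256 with a 6-character hex prefix is just one reasonable choice.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;

public class WeatNetCodec {
    // Writes one message: 1-byte phase, 1-byte type, 4-byte payload length, payload.
    public static void writeMessage(DataOutputStream out, byte phase, byte type,
                                    String payload) throws IOException {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        out.writeByte(phase);
        out.writeByte(type);
        out.writeInt(body.length);  // 4-byte big-endian payload length
        out.write(body);
        out.flush();
    }

    // Reads one message; returns {phase, type, payload} as an Object[] for brevity.
    public static Object[] readMessage(DataInputStream in) throws IOException {
        byte phase = in.readByte();
        byte type = in.readByte();
        int len = in.readInt();     // tells us exactly how many payload bytes follow
        byte[] body = new byte[len];
        in.readFully(body);         // blocks until the whole payload has arrived
        return new Object[]{phase, type, new String(body, StandardCharsets.UTF_8)};
    }

    // Token: first 6 hex characters of SHA-256(username + random number).
    public static String makeToken(String username) throws Exception {
        long nonce = new SecureRandom().nextLong();
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest((username + nonce).getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.substring(0, 6);
    }
}
```

Because the length field is written with writeInt and read back with readInt, both sides agree on the payload boundary regardless of how TCP fragments the stream.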
Then, the token will be the first n (e.g., 6) characters of the output. After that, you should save the token along with the corresponding client IP + port and username, so that the server can verify the token in the future during the querying phase.

Architecture: While architecting your application, it may be beneficial to separate it into modules/layers so that different group members can work independently on a single module. For example, the authentication modules of the server and client would communicate with each other (through the Command socket) to agree on a token in the authentication phase. In the querying phase, the only responsibility of the authentication module would be as follows:
– At the client: Append the token to the message received from the weather module before sending the message through TCP to the server.
– At the server: Verify the token appended to the message received from TCP before supplying it to the weather module.
The weather module could then simply focus on communicating with the OWM API. Fig. 5 shows an example of how the modules in your project may interact. The top section represents the authentication phase, while the bottom section represents an authenticated client sending a request to the server.

Fig. 5. The interaction of the different modules of WeatNet.

Execution Scenario: Because classes are being conducted online, the client and server are expected to reside on a single machine for simplicity of both execution and demonstration. However, any group preferring a connection to a remote device is free to do so, albeit with the necessary consideration given to the overall execution and demonstration scenarios. The project envisions the groups implementing at least five unique clients and one server.

Project deliverables: You should submit your source code and project report (in a single .rar or .zip file) via Blackboard.

Report: The report should start with a very brief description of the project.
This should be followed by an explanation of the design philosophy, especially the post-authentication server-client protocol, and then by an overview of the programming of the client and server sides, including the initial authentication, the connection with the API, the file transfer mechanism, and the file verification. Instead of attaching complete code, it is strongly recommended to attach code snippets and images of the results of their execution. The report should explicitly describe the test routines developed to evaluate the full range of features required for WeatNet.

● Source Code: A .zip or .rar file that contains your implementation as a single Eclipse or IntelliJ IDEA project. If you aim to implement your project in an IDE other than the mentioned ones, you should first consult with the TA and get confirmation.
● The report is an important part of your project presentation and should be submitted as both a .pdf and a Word file. Your report should show the step-by-step OWM configuration and connection with your code, as well as your server-client communication initiation. The report acts as proof of work for you to assert your contributions to this project. Everyone who reads your report should be able to reproduce the parts we asked you to document without hard effort or any other external resource. If you need to put code in your report, segment it as finely as possible (i.e. just the parts you need to explain) and clarify each segment before heading to the next one. For code, you should take screenshots instead of copy/pasting the code directly. Strictly avoid replicating the code as a whole in the report, or leaving code unexplained.
You are expected to provide detailed explanations of all of the above clearly in your report.

Demonstration: You are required to demonstrate the execution of the WeatNet application for the defined requirements. Your demo sessions will be announced by the TAs. Attending the demo session is required for your project to be graded. All group members are expected to be available during the demo session; the on-time attendance of all group members at the demo session is considered a grading criterion. During your demonstration of the authentication phase, you will be asked to demonstrate two clients being authenticated at the same time. You can assume that all the clients will present different usernames. You need to make sure that all the users have different correct answers for each of the possible questions. The clients and the servers will be running on the same machine, so different clients should be running on different ports. In this vein, the server will differentiate the clients not only by their IP address, but also by their port number. For the demonstration, the server should send three different questions before authenticating the user. First, you will need to show an unsuccessful authentication, then a successful authentication. During the demonstration, the group will be asked to display all the operations of the application, starting from initiating client connections and authentication to passing requests for the listed metrics. It is strongly recommended that appropriate test routines be developed to present an effective demonstration and display the full range of features defined for the WeatNet application. The groups have the creative freedom to present any additional features they have built into their application in any way they deem feasible. However, please note that you will be given 10-15 minutes for your demonstration.
Following is a detailed but not exhaustive list of the test routines the groups should develop to aid them in testing and demonstration:
1. Client creation. The application should be designed with client scalability in mind, even though the testing will be done with at most 5 clients.
2. Single client connection with the server.
3. Multiple clients simultaneously connecting with the server.
4. Authentication procedure for a single, randomly chosen client.
5. The process from the initiation of a request by the client side to the final verification and display of the received file at the client side.
6. The process in case of a mismatch between the received file and the received hash.

Suggested task distribution: We recommend you work in a group of 3 students, and suggest the following task distribution accordingly:
● Student 1: Client-side programming and authentication.
● Student 2: Server-side programming and multithreading.
● Student 3: OpenWeatherMap API programming and the file transfer mechanism.
● All members of the group perform integration, tests, and the report.
The groups should be clear on the task distribution, and the relevant questions will be directed towards the student responsible for each task. Good luck!

This project is about the transport layer of the network protocol stack. The focus is on the SSL, TCP and UDP protocols. For this purpose, you are asked to modify the provided SSL client/server code as specified below, experiment with TCP and UDP features, and implement a Stop-and-Wait ARQ protocol. You are asked to use the Wireshark network protocol analyzer tool to answer transport-layer-related questions. Wireshark is the world's foremost network protocol analyzer, and is the de facto standard across many industries and educational institutions. It can be downloaded freely at https://www.wireshark.org/download.html. Wireshark allows users to trace network activity by capturing all the packets that hit your network interface.
It tags the information of each layer by parsing the given byte stream according to the corresponding protocol. You should read this project document carefully before starting your tasks.

Part 1 – SSL Implementation and Experiments: Figure 1 illustrates an overview of the SSL protocol. Recall that you are provided SSL client/server code that performs echo on top of an SSL socket. The corresponding SSL practical content codes and slides are available through the course web site.

Figure 1. SSL Protocol Overview

As presented in the figure, the certificate is sent by the server to the user at the start of the session. The user adds this certificate to the local key store and uses it for authentication. The code to add the certificate to the local key store is provided, but the part where the server sends the certificate to the user is missing. In this part, you are asked to modify the provided code as follows:
▪ Set up a TCP connection on which the certificate will be transferred to the client. The TCP connection can listen on any port. The server should ask for client verification before the certificate is transferred to the client. You can keep a file of already known users on the server side and use them for login.
▪ Use the certificate to connect to the server through SSL. You should keep the certificate in the right directory.
▪ The SSL connection at the server side should listen on the port numbered by your KUSIS-ID + the DD from your date of birth (DDMMYY) (look at the first question). You may handle the case where the number becomes larger than the available ports by any mathematical manipulation, and explain your approach in the report.
▪ After the SSL connection is established, your client should receive your KUSIS username (e.g. abcdef18) + KUSIS ID character by character, in separate messages, in a non-persistent manner. Then, the KUSIS username + KUSIS ID should be printed at the client side.
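As one way to prototype the character-by-character, non-persistent exchange before wiring in SSL, the sketch below opens a fresh plain TCP connection per character; for the actual task you would swap in SSLServerSocketFactory/SSLSocketFactory. All names here are illustrative, not from the provided code.

```java
import java.io.*;
import java.net.*;

public class CharByChar {
    // Sends each character of id over its own short-lived connection
    // (non-persistent), then reassembles the string on the client side.
    public static String exchange(String id) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: any free port
            Thread serverThread = new Thread(() -> {
                try {
                    for (char c : id.toCharArray()) {
                        try (Socket s = server.accept();
                             DataOutputStream out =
                                     new DataOutputStream(s.getOutputStream())) {
                            out.writeChar(c); // one character per connection
                        }
                    }
                } catch (IOException ignored) { }
            });
            serverThread.start();
            StringBuilder received = new StringBuilder();
            for (int i = 0; i < id.length(); i++) {
                try (Socket client = new Socket("localhost", server.getLocalPort());
                     DataInputStream in = new DataInputStream(client.getInputStream())) {
                    received.append(in.readChar()); // read the single character
                }
            }
            serverThread.join();
            return received.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(exchange("abcdef18")); // prints the reassembled ID
    }
}
```

Captured in Wireshark, each character then appears in its own connection with a full handshake and teardown, which is exactly what question 3 asks you to count.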
Important Notes:
– Your modified SSL code must be submitted along with your project report. In your project report, you should explain your answers and provide your Wireshark outputs for each question in order to get credit.
– On Windows, you might not be able to capture the Loopback interface, i.e. the traffic inside your operating system. When you run your server and client software in a single operating system, you need to capture the Loopback interface in order to capture incoming and outgoing packets. To work around this problem, you can run one of the applications (client or server) on a separate machine.

After running your code, answer the following questions:
1. Locate the SSL server IP address and port number, and the client IP address and port number, through which these agents are communicating, by using Wireshark.
2. Locate the TCP segments containing data. What is written in the data field? Compare it with the data you exchanged between the client and server. Why do you think this is the case?
3. How many TCP segments are transmitted in total while your KUSIS username + KUSIS ID is exchanged one by one over non-persistent connections?
4. What difference did you see between the payloads of SSL and TCP? Can you locate the login name and password entered by the user? Can you locate the email information?

Part 2 – TCP Experiments: Before beginning the exploration of TCP, you need to use Wireshark to obtain a packet trace of the TCP transfer of a file from your computer to a remote server. You need to run Wireshark before starting this process to obtain the trace of the TCP segments sent and received from your computer. You are asked to do so by accessing a Web page that will allow you to enter the name of a file stored on your computer (which contains the ASCII text of Alice in Wonderland), and then transfer the file to a Web server using the HTTP POST method.
We are using the POST method rather than the GET method because we would like to transfer a large amount of data from your computer to another computer. Perform the following:
▪ Run Wireshark and start capturing the traffic.
▪ Start up your web browser. Go to https://gaia.cs.umass.edu/wireshark-labs/alice.txt and retrieve an ASCII copy of Alice in Wonderland. Store this file somewhere on your computer.
▪ Next, go to https://gaia.cs.umass.edu/wireshark-labs/TCP-wireshark-file1.html
▪ You should see a screen that looks like:
▪ Use the Browse button in this form to enter the name of the file (full path name) on your computer containing Alice in Wonderland (or do so manually). Don't yet press the "Upload alice.txt file" button.
▪ Now start up Wireshark and begin packet capture (Capture->Start), and then press OK on the Wireshark Packet Capture Options screen (we will not need to select any options here).
▪ Returning to your browser, press the "Upload alice.txt file" button to upload the file to the gaia.cs.umass.edu server. Once the file has been uploaded, a short congratulations message will be displayed in your browser window.
▪ Stop Wireshark packet capture.

Answer the following questions for the TCP segments:
5. Obtain the Flow Graph of the TCP communication. What is the significance of the various IP addresses shown in the Flow Graph? Using the flow graph, identify the three-way handshake and terminating handshake messages for the TCP connection. Provide screenshots for each explanation.
6. What are the sequence numbers (as they appear in the Wireshark program) of the segments used for the 3-way handshake protocol that initiates the first TCP connection? What are the port numbers used on the client and server sides?
7. What is the sequence number of the SYNACK segment sent by gaia.cs.umass.edu to the client computer in reply to the SYN?
8. What is the value of the Acknowledgement field in the SYNACK segment?
How did gaia.cs.umass.edu determine that value? What is it in the segment that identifies the segment as a SYNACK segment?
9. What is the sequence number of the TCP segment containing the HTTP POST command? Note that in order to find the POST command, you'll need to dig into the packet content field at the bottom of the Wireshark window, looking for a segment with "POST" within its DATA field.
10. Consider the TCP segment containing the HTTP POST as the first segment in the TCP connection. What are the sequence numbers of the first six segments in the TCP connection? At what time was each segment sent? When was the ACK for each segment received? Given the difference between when each TCP segment was sent and when its acknowledgement was received, what is the RTT value for each of the six segments? What is the EstimatedRTT value (see Section 3.5.3 in the textbook) after the receipt of each ACK? Assume that the value of the EstimatedRTT is equal to the measured RTT for the first segment, and is then computed using the EstimatedRTT equation (Section 3.5.3 in the textbook) for all subsequent segments.
Note: Wireshark has a nice feature that allows you to plot the RTT for each of the TCP segments sent. Select a TCP segment in the "listing of captured packets" window that is being sent from the client to the gaia.cs.umass.edu server. Then select: Statistics->TCP Stream Graph->Round Trip Time Graph.

Part 3 – UDP Experiments: In this part, you are assigned a unique URL to work with. The list is provided in the file Project2_URL_List.pdf, and you must use the URL assigned to you. You should provide the appropriate screenshots and work on the correct domain in order to get credit. The nslookup command works as an IP address resolver: when you provide a domain name as an argument, it will return the IP address of that domain. Now, take the steps provided below and answer the questions accordingly.
▪ Start your Wireshark software and start capturing packets from the appropriate interface.
▪ Use the nslookup command to resolve the IP address of the URL that is assigned to you.
▪ Stop packet capturing in Wireshark.
▪ Apply an appropriate display filter.

11. What display filter did you apply in order to see the appropriate packets?
12. Which application layer and transport layer protocols does nslookup work on? What is the reason that this transport layer protocol is chosen?
13. Can you determine whether the local DNS server you connected to works in an iterative or a recursive manner? Whether you can or cannot, please provide a detailed explanation. Please also briefly explain the advantages and disadvantages of the iterative and recursive approaches over each other.
14. What are the header lengths of the application layer protocol and the transport layer protocol that nslookup works on?
15. How many checksums does a UDP segment have in the checksum field? Why?

Part 4.a – Stop-and-Wait ARQ Protocol: Recall the Stop-and-Wait ARQ protocols you have seen in the lectures. For this part, you are to implement a Stop-and-Wait ARQ protocol at the transport layer. We have already given you a base code in Java. You must implement your methods over this code. After your implementation is complete, you can run the main method of Main.java under the main package to see whether your code passes the tests. If you are having problems, you may set Main.DEBUG to true to see more information on what is happening behind the scenes. You should see the following output when your implementation is completed successfully:

Fig 1. The output you should be getting.

You only need to implement two methods residing in Transport.java under the transport package:
● void sendWithARQ(Packet[] packets): Used by the sender. Gets an ordered list of message packets and sends them one by one to the receiver, using Stop-and-Wait ARQ.
● Packet[] receiveWithARQ(): Used by the receiver.
Receives the message packets sent by the sender using Stop-and-Wait ARQ and returns them as an array.

While implementing your methods, you will need to use the methods and classes that are already provided to you. First, take a look at Packet.java under the network package. A packet can be either a message packet or an acknowledgement packet. Here are the important fields that you should be familiar with in your implementation: ack, lastPacket, sequenceNumber, characters, timedOut.

The methods that you should use are in Transport.java under the transport package:
● Packet receivePacket(int timeout)
Receives a packet from the other process. If the timeout parameter is given as > 0, the process waits timeout milliseconds for a packet before failing. Upon timeout failure, this method returns an empty packet with its timedOut flag set to true. If the timeout parameter is given as ≤ 0, this method blocks indefinitely.
● void sendMsgPacket(int sequenceNumber, boolean lastPacket, Packet packet)
Sends a single message packet to the other process. The sequenceNumber parameter is either 0 or 1, and it denotes the sequence number of the message packet you are sending. The lastPacket parameter should be set to true only when you are sending the last message packet to the receiver. The packet parameter is the actual message packet you are trying to send.
● void sendAckPacket(int sequenceNumber, boolean lastAck)
Sends an acknowledgement packet to the other process. The sequenceNumber parameter is either 0 or 1, denoting the sequence number of the acknowledgement packet. The lastAck parameter should be set to true only when this acknowledgement is being sent in response to the last message packet received from the sender.

In conclusion, your implementation should do the following:
● At the sender (i.e., the sendWithARQ method):
1. The sender must send each message packet using sendMsgPacket, setting the arguments correctly.
2. After a packet is sent, the sender must wait for an acknowledgement from the receiver using receivePacket with a reasonable timeout of your choice.
3. If the acknowledgement is not received within this duration, the sender must retransmit the packet.
4. If the acknowledgement was received within this duration and the sequence number is as expected, the sender must proceed to send the next packet.
● At the receiver (i.e., the receiveWithARQ method):
1. The receiver must wait for packets using receivePacket.
2. Once a packet is received, the receiver should check the sequence number (using Packet.sequenceNumber) and discard or save the packet depending on its sequence number.
3. The receiver should also check the Packet.lastPacket field of the received message packet.
▪ If the received packet is not the last message packet: the receiver must send a single acknowledgement packet with the correct sequence number using sendAckPacket, and then wait for the next message packet.
▪ If the received packet is the last message packet: the receiver must send its last acknowledgement using sendAckPacket and break out of the receive loop.

Part 4.b – Analyzing Your Implementation

After your implementation is complete, please answer the following questions in your report.

16. Start listening on your Loopback interface with Wireshark and run your code. How many retransmissions have occurred from the sender to the receiver? Explain.
17. Attach a screenshot of the first packet sent by the sender and the last packet received by the receiver. How long did it take to transfer all the packets? Consider the timeout period that you have chosen and the number of retransmissions. Does this result make sense? Why or why not? Note: you will not see the "lost packets" in Wireshark.

Project Deliverables:

Important Note: You are expected to submit a project report, in PDF format, that documents and explains all the steps you have performed in order to achieve the assigned tasks of the project.
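The alternating-bit logic described in Part 4.a can be exercised with a toy, single-threaded simulation. This sketch is illustrative only: the LossyChannel class and its drop-every-third-send pattern are invented here, and the real assignment instead uses the provided Transport and Packet classes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class StopAndWaitDemo {
    /** Hypothetical data channel that deterministically loses every third transmission. */
    static final class LossyChannel {
        private int sends = 0;
        private final ArrayDeque<int[]> wire = new ArrayDeque<>(); // {seq, payload}
        void send(int seq, int payload) {
            if (++sends % 3 != 0) wire.add(new int[]{seq, payload}); // else: packet lost
        }
        int[] receive() { return wire.poll(); } // null models a timeout
    }

    /** Sends data with alternating 0/1 sequence numbers; returns what arrived, in order. */
    static List<Integer> transfer(int[] data) {
        LossyChannel ch = new LossyChannel();
        List<Integer> delivered = new ArrayList<>();
        int expected = 0; // receiver's next expected sequence number
        for (int i = 0; i < data.length; ) {
            ch.send(i % 2, data[i]);      // sender: transmit packet i
            int[] pkt = ch.receive();     // receiver side of the same step
            if (pkt == null) continue;    // "timeout": resend the same packet
            if (pkt[0] == expected) {     // in-order packet: deliver and flip the bit
                delivered.add(pkt[1]);
                expected ^= 1;
            }
            i++;                          // ACK assumed received (ACKs are never lost here)
        }
        return delivered;
    }

    public static void main(String[] args) {
        // All five payloads arrive despite two losses (sends 3 and 6 are dropped).
        System.out.println(transfer(new int[]{10, 20, 30, 40, 50}));
    }
}
```

The key Stop-and-Wait property shows up in the `continue` branch: on a timeout, the sender retransmits the same packet with the same sequence number and only advances after an acknowledgement.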
A full grade report is one that clearly details and illustrates the execution of the project. Anyone who follows your report should be able to reproduce your performed tasks without effort. Use screenshots to illustrate the steps and provide clear and precise textual descriptions as well. All reports will be analyzed for plagiarism; please be aware of the KU Statement on Academic Honesty.

The name of your project .zip file must be -.zip. You should turn in a single .zip file including:
▪ Source codes: containing the source codes of the client and server, and also your completed version of the part 4 code.
▪ Project.pdf file (your report; it should include the answers and the corresponding Wireshark screenshots).
▪ Saved capture files from Wireshark.

Figures in your report should be scaled to be visible and clear enough. All figures should have captions, should be numbered according to their order of appearance in the report, and should be referenced and described clearly in your text. All pages should be numbered and have headers matching your file-naming criteria. If you employ any (online) resources in this project, you must reference them in your report. There is no page limit for your report, and no specific requirements on the design. Good Luck!

This project is about the network layer of the Internet protocol stack. The objectives are to examine network layer data, the principles behind network layer services, and routing (path selection). Through this project, you will practice with Wireshark as well as a simplified network routing simulator. The first part of the project requires you to analyze traffic data at the network layer through Wireshark: ICMP traffic resulting from the application of a) the ping and b) the traceroute commands will be monitored. The second part of the project involves working with the provided routing simulator and implementing routing strategies to analyze their performance.
Part I: ICMP Analysis

ICMP is a companion protocol to IP that helps IP perform its functions by handling various error and test cases. The Internet Control Message Protocol (ICMP) is a supporting protocol in the Internet protocol suite. It is used by network devices, including routers, to send error messages and operational information indicating, for example, that a requested service is not available or that a host or router could not be reached. ICMP differs from transport protocols such as TCP and UDP in that it is not typically used to exchange data between systems, nor is it regularly employed by end-user network applications (with the exception of some diagnostic tools like ping and traceroute).

1.a: Ping Analysis

For this part of the project, you will use the ping command to analyze the working of ICMP. ping uses the ICMP protocol's mandatory ECHO_REQUEST datagram to elicit an ICMP ECHO_RESPONSE from a host or gateway. ECHO_REQUEST datagrams ("pings") have an IP and ICMP header, followed by a struct timeval and then an arbitrary number of "pad" bytes used to fill out the packet. ping works with both IPv4 and IPv6; using only one of them can be enforced explicitly by specifying -4 or -6.

● Run the Wireshark packet capture.
● Ping the hostname URL assigned to you.
● You are required to send exactly 5 ping messages at 5 different times in a day (or over the duration of the project) and share a screenshot of the command prompt with a summary of the ping completion and the associated statistics.
● After the ping completes, stop the capture and answer the following questions. Remember to attach screenshots with each answer highlighting the relevant area.

1. What are the three layers in the ICMP packet?
2. What is TTL and what is its significance? Which layer does it reside in, and is it constant (format- and number-of-bits-wise) across IPv4 and IPv6 ping commands?
3. Why does an ICMP packet not have source and destination port numbers?
4.
What is the length of the data field of the ICMP Type 8 (echo request) part? Elaborate on the structure of the data field, citing any common and any changing parts across the various messages. If there are changing parts in the data field, what do you think is the reason for that?
5. Find the minimum TTL below which the ping messages do not reach your particular URL destination.
6. How do the Identifier and Sequence Number fields compare for successive echo request packets?

1.b: Traceroute Analysis

In this part, you will use traceroute (which may need to be installed) to perform the same set of actions as in Part 1.a. Traceroute is implemented in different ways in Unix/Linux/macOS and in Windows. In Unix/Linux, the source sends a series of UDP packets to the target destination using an unlikely destination port number; in Windows, the source sends a series of ICMP packets to the target destination. For both operating systems, the program sends the first packet with TTL=1, the second packet with TTL=2, and so on. Recall that a router will decrement a packet's TTL value as the packet passes through it. When a packet arrives at a router with TTL=1, the router sends an ICMP error packet back to the source. In the following, we'll use the native Windows tracert program. A shareware version of a much nicer Windows traceroute program is pingplotter (www.pingplotter.com).

The source and destination IP addresses in an IP packet denote the endpoints of an Internet path, not the IP routers on the network path the packet travels from the source to the destination. Traceroute is a utility for discovering this path. It works by eliciting ICMP TTL Exceeded responses from the router 1 hop away from the source towards the destination, then 2 hops away, then 3 hops, and so forth, until the destination is reached. The responses will identify the IP address of each router.
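The hop-discovery loop just described can be sketched as a small simulation, in Java; the path and the addresses below are invented for illustration, not a real trace:

```java
import java.util.ArrayList;
import java.util.List;

final class TracerouteSketch {
    // path[i] is the router (or final host) sitting i+1 hops from the source.
    static List<String> discover(String[] path, String destination) {
        List<String> responders = new ArrayList<>();
        for (int ttl = 1; ttl <= path.length; ttl++) {
            // A probe sent with this TTL expires exactly at hop `ttl`, so that
            // node answers (TTL Exceeded, or an Echo Reply at the destination).
            String responder = path[ttl - 1];
            responders.add(responder);
            if (responder.equals(destination)) break; // path fully discovered
        }
        return responders;
    }

    public static void main(String[] args) {
        // Hypothetical 3-hop path toward an example destination.
        String[] path = {"10.0.0.1", "192.0.2.7", "198.51.100.3"};
        System.out.println(discover(path, "198.51.100.3"));
    }
}
```

Probes with TTL 1, 2, 3, ... reveal the hops in order, which is exactly the per-TTL structure you should recognize in the capture.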
Since traceroute takes advantage of common router implementations, there is no guarantee that it will work for all routers along the path, and it is usual to see " * " responses when it fails for some portions of the path.

● Start up the Wireshark packet sniffer, and begin Wireshark packet capture.
● Use traceroute with the same URL. (Note that on a Windows machine, the command is "tracert" and not "traceroute".)
● On Linux, force the traceroute command to send ICMP packets instead of UDP packets. You may look for this information using 'man traceroute' and choosing the appropriate flag.
● When the traceroute program terminates, stop packet capture in Wireshark.

At the end of the experiment, your Command Prompt window should show that, for each TTL value, the source program sends three probe packets. Traceroute displays the RTTs for each of the probe packets, as well as the IP address (and possibly the name) of the router that returned the ICMP TTL-exceeded message.

7. How long is the ICMP header of a TTL Exceeded packet? Select different parts of the header in Wireshark to see how they correspond to the bytes in the packet.
8. How does your computer (the source) learn the IP address of a router along the path from a TTL exceeded packet?
9. How many times is each router along the path probed by traceroute?
10. Within the tracert measurements, is there a link whose delay is significantly longer than others? The echo request packets sent by traceroute are probing successively more distant routers along the path. You can look at these packets and see how they differ when they elicit responses from different routers.

Part II: Routing Implementation

In this part, you are asked to implement greedy routing algorithms at the network control plane and answer related questions. You are provided with a simulator and two topologies. You will need to implement four different algorithms and observe their behavior under the given topologies.
The topologies given to you can be visualized as follows:

Fig 1. Topology 1 visualized.
Fig 2. Topology 2 visualized.

The numbers on the nodes denote the addresses of the nodes (i.e., routers), and the numbers on the edges denote the costs of the links. Please note that the graphs are undirected; for example, node 2 has a link to node 3 with a cost of 1, and vice versa. In our simulation, node 1 tries to send a packet to node 4.

Simulator Explanation

The simulator given to you can be executed by running the Main.java file. Its outputs will be helpful to you in your implementation and in answering the questions. The expected simulator outputs are given in the "expected_output.txt" file. The algorithms that you need to implement reside in the "algorithms" package. You only need to implement the selectNeighbors method for each algorithm. Each node stores an instance of an Algorithm and invokes its selectNeighbors method when there is a new packet to forward. The output of this method determines the neighbors to which the packet will be forwarded. Briefly, selectNeighbors takes the following parameters:
● Origin: the address of the origin of the packet.
● Destination: the address of the destination of the packet.
● PreviousHop: the address of the previous hop (i.e., the node that the packet was sent from).
● Neighbors: a list of NeighborInfo instances describing the neighbors that the node has.

Your implementation should select a subset of Neighbors and return them. The node that invoked the algorithm will route the packet to the list of neighbors that you have returned.

NaiveFloodingAlgorithm: Go to the NaiveFloodingAlgorithm class residing under the "algorithms" package. This algorithm is already given to you as an example. A node using this algorithm simply routes a packet to all of its neighbors (except the previous hop). Run the simulator and observe the behavior of this algorithm. Then answer the following questions in your report:

11.
See that this algorithm succeeds in topology 1. What does the total communication cost represent, and why is it different from the path cost?
12. See that this algorithm fails in topology 2, and the simulator notes that the protocol does not converge. Why is this the case?

FloodingAlgorithm: FloodingAlgorithm is a simple improvement over NaiveFloodingAlgorithm: each node in the topology only routes once. You can maintain a state and return the neighbors only the first time selectNeighbors is called. Go to the FloodingAlgorithm class and implement it. Then answer the following question in your report:
13. Consider the path taken by the packet in topology 2 with this algorithm. Is this what you expected? Why or why not?

NaiveMinimumCostAlgorithm: NaiveMinimumCostAlgorithm always chooses the link with the smallest cost. This algorithm returns only a single neighbor (i.e., a list with only one neighbor), as opposed to the flooding algorithms. While finding the appropriate neighbor, make sure that you do not consider the previous hop. Go to NaiveMinimumCostAlgorithm and implement it. Then answer the following question in your report:
14. If you implemented the algorithm as specified, it should succeed in topology 1 but fail in topology 2. Why does it fail in topology 2?

MinimumCostAlgorithm: MinimumCostAlgorithm is an improvement over NaiveMinimumCostAlgorithm. More specifically, we now only consider the edges that were not previously used. This algorithm should maintain an exclusion set, i.e., a set of neighbors that should be excluded from routing to prevent cycles. When choosing the link with the minimum cost, we take the minimum over the neighbors that are not in the exclusion set. If all of the neighbors are already in the exclusion set, we simply choose a random neighbor.
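The route-only-once idea behind FloodingAlgorithm, and the difference between total communication cost and path cost, can be sketched on a small graph. The 4-node topology below is invented for illustration and is NOT the project's Fig. 1 or Fig. 2:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class FloodOnceSketch {
    // Flood from src, letting each node forward only once (FloodingAlgorithm's rule).
    // Returns the total communication cost: the summed cost of every link traversed
    // by any copy of the packet.
    static int totalCommunicationCost(Map<Integer, Map<Integer, Integer>> adj,
                                      int src, int dst) {
        Set<Integer> hasRouted = new HashSet<>();
        Deque<int[]> inFlight = new ArrayDeque<>(); // {currentNode, previousHop}
        inFlight.add(new int[]{src, -1});
        int total = 0;
        while (!inFlight.isEmpty()) {
            int[] p = inFlight.poll();
            int at = p[0], from = p[1];
            if (at == dst || !hasRouted.add(at)) continue; // arrived, or already routed once
            for (Map.Entry<Integer, Integer> e : adj.get(at).entrySet()) {
                if (e.getKey() == from) continue;          // never send back to previous hop
                total += e.getValue();                     // every copy sent costs its link
                inFlight.add(new int[]{e.getKey(), at});
            }
        }
        return total; // terminates even on a cyclic topology, unlike naive flooding
    }

    // Hypothetical square topology: 1-2 (1), 2-3 (1), 3-4 (1), 1-4 (5).
    static Map<Integer, Map<Integer, Integer>> demoTopology() {
        return Map.of(
                1, Map.of(2, 1, 4, 5),
                2, Map.of(1, 1, 3, 1),
                3, Map.of(2, 1, 4, 1),
                4, Map.of(1, 5, 3, 1));
    }

    public static void main(String[] args) {
        // Total cost is 8 (copies cross links 1-2, 1-4, 2-3, 3-4), while the cheapest
        // path 1-2-3-4 costs only 3: flooding pays for every duplicate copy it sends.
        System.out.println(totalCommunicationCost(demoTopology(), 1, 4));
    }
}
```

The `hasRouted` set is the one-line difference between this and naive flooding: without it, copies would circulate around the 1-2-3-4 cycle forever.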
Please note that at the end of each call to selectNeighbors, two nodes should be added to the exclusion set: (1) the node that the packet was received from, and (2) the node that the packet is being forwarded to. Go to MinimumCostAlgorithm and implement it.
15. List the nodes in the exclusion set of each node at the end of the simulation of topology 2.

Demonstration: In the demo session, you are required to demonstrate the working of your Part II implementation, including the executions of the routing algorithms. You are also expected to answer questions on the concepts of the network layer. The dates and the schedule of the demonstrations will be announced later.

Project Deliverables:

Important Note: You are expected to submit a project report, in PDF format, that documents and explains all the steps you have performed in order to achieve the assigned tasks of the project. A full grade report is one that clearly details and illustrates the execution of the project. Anyone who follows your report should be able to reproduce your performed tasks without effort. Use screenshots to illustrate the steps and provide clear and precise textual descriptions as well. All reports will be analyzed for plagiarism; please be aware of the KU Statement on Academic Honesty.

The name of your project .zip file must be -.zip. You should turn in a single .zip file including:
▪ Source codes: containing the source codes of your completed version of Part II.
▪ -_P3.pdf titled Project Report.
o For Part I, your report should include the answers to the questions and the corresponding Wireshark screenshots.
o For Part II, a brief explanation of the implementation of the requirements, supported by code snippets, plus answers to the questions from this part.
▪ Saved capture files from Wireshark.

Figures in your report should be scaled to be visible and clear enough.
All figures should have captions, should be numbered according to their order of appearance in the report, and should be referenced and described clearly in your text. All pages should be numbered, and have headers the same as your file naming criteria. If you employ any (online) resources in this project, you must reference them in your report. There is no page limit for your report. Good Luck!


[SOLVED] Comp301 projects 1 to 4 solution

Problem Definition: To represent a certain set of quantities in a particular way, we are defining a new data type. In PS1, you created two different representations (unary and bignum) of natural numbers. Another example is representing all the integers (negative and non-negative) as diff-trees, where a diff-tree is a list defined by the grammar in the book's Exercise 2.3 (page 34):

Diff-tree ::= (one) | (diff Diff-tree Diff-tree)

These examples show how data abstraction can be managed with different interfaces and implementations. In this project, you will create a new data type to represent a quantity of your choice. You can select any quantity to represent, such as Natural Numbers, Rational Numbers, or Cities on a Map.

Part A. Similar to how natural numbers are represented in the Unary and BigNum representations, you will define two new types to represent your selected quantity. In this part, give the grammar definitions of your data types.

Part B. Implement these representations in Scheme. For each of these representations, implement the following procedures:
• create: gets an input and creates the new data type.
• is-empty?: returns #t if the representation has no value, otherwise returns #f.
• successor: gets a representation and a small building block of your data type, and adds the building block to the represented value.

Part C. Please explain what Constructors, Observers, Extractors and Predicates are. For each procedure described in Part B, indicate whether it is a Constructor, Observer, Extractor or Predicate.

Part D. Create new test cases for the following procedures of both representations.
• create: write one case.
• is-empty?: write two test cases, one returning true and one returning false.
• successor: write one case.

In this project, you will work in groups of two or three.
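The diff-tree grammar above denotes integers by differences: (one) denotes 1, and (diff t1 t2) denotes the value of t1 minus the value of t2. A sketch of the idea, in Java for illustration only (the course work itself is in Scheme, and the successor helper below is our own construction, not part of the exercise):

```java
interface DiffTree {
    record One() implements DiffTree {}
    record Diff(DiffTree left, DiffTree right) implements DiffTree {}

    // (one) denotes 1; (diff l r) denotes value(l) - value(r).
    static int valueOf(DiffTree t) {
        if (t instanceof Diff d) return valueOf(d.left()) - valueOf(d.right());
        return 1;
    }

    // successor(t) adds 1 by subtracting -1, where -1 = (diff (diff (one) (one)) (one)).
    static DiffTree successor(DiffTree t) {
        DiffTree one = new One();
        DiffTree minusOne = new Diff(new Diff(one, one), one); // (1 - 1) - 1 = -1
        return new Diff(t, minusOne);
    }
}

class DiffTreeDemo {
    public static void main(String[] args) {
        DiffTree two = DiffTree.successor(new DiffTree.One());
        System.out.println(DiffTree.valueOf(two)); // prints 2
    }
}
```

Note that every integer has infinitely many diff-tree representations (e.g., 0 is (diff (one) (one)) but also (diff (diff (one) (one)) (diff (one) (one)))), which is exactly the kind of interface-vs-representation gap the project asks you to explore.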
To create your group, use the Google Sheet file in the following link: Link to Google Sheets for Choosing Group Members. Note: You need to self-enroll to your Project 2 group on Blackboard (please only enroll to the same group number as your group in the Sheets); please make sure that you are enrolled to Project 2 – Group #YourGroup. This project contains a bonus component specified at the end, and there are two code boilerplates provided to you: use Project2MYLET for the project and Project2BONUS for the bonus.

Submit a report containing your answers to the written questions in PDF format and Racket files for the coding questions to Blackboard as a zip. Include a brief explanation of your team's workload breakdown in the PDF file. If you attempt to solve the bonus question, make sure that your zip includes both the Project2MYLET and Project2BONUS folders separately. Name your submission files as p2_member1IDno_member1username_member2IDno_member2username.zip, for example p2_0011221_galtintas17_0011222_mkarakas16.zip. Please use the Project 2 Discussion Forum on Blackboard for all your questions. The deadline for this project is Nov 15, 2020 – 23:59 (GMT+3: Istanbul Time). Read your task requirements carefully. Good luck!

Table 1. Grade Breakdown for Project 2
Question   Grade Possible
Part A     15
Part B     10
Part C     5
Part D     60
Part E     10
Total      100
Bonus      2 pts

Problem Definition: To evaluate programs, you need to understand the expressions of the language. It is the same for computers; therefore, you saw in the lecture how you can invent a language and define it for the computer to understand and evaluate. In this project, you will define a language named MYLET that is similar to the simple LET language covered in class. The syntax for the MYLET language is given below.

Program    ::= Expression                                  a-program (exp1)
Expression ::= Number                                      const-exp (num)
Expression ::= String                                      str-exp (str)
Expression ::= op(Expression, Expression, Number)          op-exp (exp1, exp2, num)
Expression ::= zero? (Expression)                          zero?-exp (exp1)
Expression ::= if Expression then Expression
               {elif Expression then Expression}*
               else Expression                             if-exp (exp1 exp2 conds exps exp3)
Expression ::= Identifier                                  var-exp (var)
Expression ::= let Identifier = Expression in Expression   let-exp (var exp1 body)

Figure 1. Syntax for the MYLET language

Part A. This part will prepare you for the following parts of the project. (15 pts)
(1) Write the 5 components of the language (hint: review the Lecture 10 slides).
(2) For each component, specify where, or in which Racket file (if it applies), we define and handle them.

Part B. In this part, you will create an initial environment for programs to run. (10 pts)
(1) Create an initial environment that contains 3 different variables (x, y, and z).
(2) Using the environment abbreviation shown in the lectures, write how the environment changes at each variable addition.

Part C. Specify the expressed and denoted values for the MYLET language. (5 pts)

Part D. This is the main part of the project, where you implement the MYLET language given in Figure 1 by adding the missing expressions.
(1) Add str-exp to the language. Strings are defined as any text starting and ending with ', e.g. 'comp301', 'program'; strings are stored with the ' symbols. (15 pts) Hint: String is an expression that is similar to Number; understanding the addition and implementation of Number may be helpful to complete this step.
(2) Add op-exp to the language. (15 pts) op-exp is similar to the diff-exp of the LET language; however, in the LET language, the only possible operation was subtraction. op-exp enables you to perform 4 arithmetic operations via its third input (Number). When the third input is:
• 1: perform addition (exp1 + exp2)
• 2: perform multiplication (exp1 * exp2)
• 3: perform division (exp1 / exp2)
• any other number: perform subtraction (exp1 – exp2)
(3) Add if-exp to the language. Unlike the if-exp of the LET language, you can add multiple conditions to be checked through the elif-then extension.
Starting from the condition of if, conditions will be checked until a true condition is found, and expression corresponding to the true condition will be evaluated as a result. If none of the if/elif conditions are correct, the expression in the else statement will be evaluated. (15 pts) (4) Add a custom expression to the language. The expression can be simple, but you need to clearly explain what it does and how it works. You also need to provide the syntax of the expression. (15 pts) Note that the implementation of the other expressions, that are same with the LET language, are already given in the .rkt file provided. We deleted the former implementations of if and diff-exp. Part E. Create the following test cases. (10 pts) (1) custom expression: Write test cases that controls if the expression works according to your explanation of the expression. Note: We provided several test cases for you to try your implementation. Uncomment corresponding test cases and run tests.rkt to test your implementation. Bonus. Here is an alternative datatype ropes that allows manipulation of sequence of characters instead of the most commonly used strings. You can try to implement ropes instead of strings as a bonus challenge. Note: The bonus question is worth 2 points in your overall final grade and no partial credits will be awarded. To get full credit, please implement this problem using the second code boilerplate (Project2BONUS) provided and write at least 6 test cases (two for each: fetch ith character, concatenate, substring) in a clear way to your tests.rkt for us to run. Please make sure that your test cases are clear and tests.rkt doesn’t give any errors, otherwise you won’t be able to receive any credits for this question. Add your code for the bonus problem to your submission as specified in the instructions. 
Hint: Define your rope datatype similar to the way you did in the project, clearly define your grammar, and feel free to use any helper procedures.

In this project, you will work in groups of two or three. To create your group, use the Google Sheet file in the following link: Link to Google Sheets for Choosing Group Members. Note: You need to self-enroll to your Project 3 group on Blackboard (please only enroll to the same group number as your group in the Sheets); please make sure that you are enrolled to Project 3 – Group #YourGroup. This project contains a boilerplate provided to you; use Project3DataStructures for the project. Submit your Racket files for the coding questions to Blackboard as a zip. Include a brief explanation of your approach to the problems and your team's workload breakdown in a PDF file. Name your submission files as p3_member1IDno_member1username_member2IDno_member2username.zip, for example p3_0011221_mozcelik17_0011222_hsasmaz16.zip.

Important Notice: If your submitted code is not working properly, i.e. throws an error or fails in all test cases, your submission will be graded as 0 directly. Please comment out the parts that throw errors, and indicate explicitly in your report both which parts work and which do not.

Testing: You are provided some test cases under tests.scm. Please check them to understand how your implementation should work. You can run all tests by running top.scm. We will test your program with additional cases, but your submission should pass all provided test cases.

Please use the Project 3 Discussion Forum on Blackboard for all your questions. The deadline for this project is Dec 19, 2020 – 23:59 (GMT+3: Istanbul Time). Read your task requirements carefully. Good luck!

Table 1. Grade Breakdown for Project 3
Question   Grade Possible
Part A     20
Part B     35
Part C     35
Report     10
Total      100

Project Definition: In this project, you will add the most common data structures, such as array, stack and queue, to EREF. Please read each part carefully, and pay attention to the Assumptions and Constraints section.

Part A. In this part, you will add arrays to EREF. Introduce new operators newarray, update-array, and read-array with the following definitions: (20 pts)

newarray: Int * Int -> ArrVal
update-array: ArrVal * Int * ExpVal -> Unspecified
read-array: ArrVal * Int -> ExpVal

This leads us to define the value types of EREF as:

ArrVal = (Ref(ExpVal))*
ExpVal = Int + Bool + Proc + ArrVal + Ref(ExpVal)
DenVal = ExpVal

The array operators are defined as follows:
newarray(length, value) initializes an array of size length with the value value.
update-array(arr, index, value) updates the value of the array arr at index index with the value value.
read-array(arr, index) returns the element of the array arr at index index.

Part B. In this part, you will implement Stack using the arrays that you implemented in Part A. (35 pts)

Stack is a data structure that serves as a collection of elements, where the elements are accessed in a LIFO (Last In, First Out) manner. In other words, when an element is added to the stack, it is added on top of all elements, and when an element is popped from the stack, the topmost element is extracted. You will implement the following Stack operators:

newstack() returns an empty stack.
stack-push(stk, val) adds the element val to the stack stk.
stack-pop(stk) removes the topmost element of the stack stk and returns its value.
stack-size(stk) returns the number of elements in the stack stk.
stack-top(stk) returns the value of the topmost element in the stack stk without removal.
empty-stack?(stk) returns true if there is no element inside the stack stk and false otherwise.
print-stack(stk) prints the elements in the stack stk.

Part C. In this part, you will implement Queue using the arrays that you implemented in Part A. (35 pts)

Queue is a data structure that serves as a collection of elements, where the elements are accessed in a FIFO (First In, First Out) manner. A good example of a queue is any queue of consumers for a resource, where the consumer that came first is served first. When an element is popped from the queue, the first element that was pushed is extracted. You will implement the following Queue operators:

newqueue() returns an empty queue.
queue-push(queue, val) adds the element val to the queue queue.
queue-pop(queue) removes the first element of the queue queue and returns its value.
queue-size(queue) returns the number of elements in the queue queue.
queue-top(queue) returns the value of the first element in the queue queue without removal.
empty-queue?(queue) returns true if there is no element inside the queue queue and false otherwise.
print-queue(queue) prints the elements in the queue queue.

Report. Your report should include the following: (10 pts)
(1) The workload distribution of group members.
(2) The parts that work properly, and those that do not.
(3) Your approach to the implementations: how does your stack/queue work?
Include your report in PDF format in your submission folder.

Assumptions and Constraints. Read the following assumptions and constraints carefully. You need not consider edge cases excluded by the assumptions.
(1) Stack and Queue do not have to be newly defined data types; you can utilize the array implementation from Part A.
(2) For stack and queue, you may assume that values are integers.
(3) For stack and queue, values will be in the range [1, 10000].
(4) The number of push operations will not exceed 1000 for a single stack/queue.
(5) It is guaranteed that the correct type of parameters will be passed to the operators.
For example, in stack-pop(stk), stk will always be a stack.
(6) If the stack/queue is empty, the pop operation must return -1.
(7) You CANNOT define global variables to keep track of the size or top element of a stack/queue. The reason is that we may create multiple stacks/queues, and each of them may have a different size and top element.

Sample Programs. Here are some sample programs for you.

let x = newstack() in
begin
  stack-push(x, 20); stack-push(x, 30); stack-push(x, 40);
  stack-pop(x);
  print-stack(x)
end
;;; 20 30

let x = newstack() in
begin
  stack-push(x, 20); stack-push(x, 30); stack-push(x, 40);
  stack-pop(x);
  empty-stack?(x)
end
;;; (bool-val #f)

let x = newqueue() in
begin
  queue-push(x, 20); queue-push(x, 30); queue-push(x, 40);
  queue-pop(x);
  print-queue(x)
end
;;; 30 40

let x = newqueue() in
begin
  queue-push(x, 20); queue-push(x, 30); queue-push(x, 40);
  queue-pop(x);
  queue-size(x)
end
;;; (num-val 2)

In this project, you will work in groups of two or three. To create your group, use the Google Sheet file in the following link: Link to Google Sheets for Choosing Group Members. Note: You need to self-enroll to your Project 4 group on Blackboard (please only enroll to the same group number as your group in the Sheets); please make sure that you are enrolled to Project 4 – Group #YourGroup. This project contains 2 main parts about 2 different topics, namely Parameter Passing and Continuation Passing Style. Submit a report containing your answers to the written questions in PDF format and Racket files for the coding questions to Blackboard as a zip. Include a brief explanation of your team's workload breakdown in the PDF file. Name your submission files as p4_member1IDno_member1username_member2IDno_member2username.zip, for example p4_0011111_baristopal20_0022222_etezcan19.zip.

Important Notice: If your submitted code is not working properly, i.e. throws an error or fails in all test cases, your submission will be graded as 0 directly.
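The sample programs above pin down the expected behavior of the operators. The same semantics can be sketched with array-backed structures, in Java for illustration only (the project itself extends EREF in Scheme; the 1000-slot capacity below follows the assignment's push bound, and the method names are ours, not the project's Scheme operators):

```java
final class ArrayStack {
    private final int[] arr = new int[1000]; // push count is bounded by 1000
    private int size = 0;                    // per-instance state, not global (constraint 7)
    void push(int v) { arr[size++] = v; }
    int pop() { return size == 0 ? -1 : arr[--size]; } // empty pop returns -1 (constraint 6)
    int top() { return arr[size - 1]; }
    int size() { return size; }
}

final class ArrayQueue {
    private final int[] arr = new int[1000];
    private int head = 0, tail = 0;          // FIFO: pop from head, push at tail
    void push(int v) { arr[tail++] = v; }
    int pop() { return head == tail ? -1 : arr[head++]; } // empty pop returns -1
    int size() { return tail - head; }
}

class StackQueueDemo {
    public static void main(String[] args) {
        ArrayStack s = new ArrayStack();
        s.push(20); s.push(30); s.push(40);
        s.pop();                                 // removes 40 (LIFO)
        System.out.println(s.top() + " " + s.size()); // 30 2, matching ";;; 20 30"

        ArrayQueue q = new ArrayQueue();
        q.push(20); q.push(30); q.push(40);
        q.pop();                                 // removes 20 (FIFO)
        System.out.println(q.size());            // 2, matching ";;; (num-val 2)"
    }
}
```

The only structural difference between the two is where pop reads from: the stack pops at the write end, the queue pops at the opposite end.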
Please comment out parts that cause errors, and indicate explicitly in your report which parts work and which do not. Please use the Project 4 Discussion Forum on Blackboard for all your questions. The deadline for this project is January 8, 2021 – 23:59 (GMT+3: Istanbul Time). Read your task requirements carefully. Good luck!

Table 1. Grade Breakdown for Project 4
1. Parameter Passing: Task 1 – 10 points; Task 2 – 15 points; Task 3 – 25 points
2. Continuation Passing Style: Task 4 – 8 points; Task 5 – 42 points

1. Parameter Passing

Task 1: Why may the pairs below sometimes give different results for the same expression?
• Call-by-value and call-by-reference
• Call-by-need and call-by-name
What are the advantages and disadvantages of each?

Task 2: To use the call-by-need parameter passing variation, some specific changes and additions have to be made to the IREF implementation. Two of these are given below:

; Change
(var-exp (var)
  (let ((ref1 (apply-env env var)))
    (let ((w (deref ref1)))
      (if (expval? w)
          w
          (let ((val1 (value-of-thunk w)))
            (begin
              (setref! ref1 val1)
              val1))))))

; Addition
(define value-of-thunk
  (lambda (th)
    (cases thunk th
      (a-thunk (exp1 saved-env)
        (value-of exp1 saved-env)))))

Explain why these code pieces are needed. Analyze in detail how this code works, line by line, and state in which file(s) of the IREF implementation it should be added.

Task 3: Write an expression that gives different results in: (1) Call-by-reference and call-by-need (2) Call-by-reference and call-by-name (3) Call-by-value and call-by-need (4) Call-by-value and call-by-name. In total, 4 expressions should be written (one for each case). As a reference, in the Parameter Passing directory of the Project Assignment zip, code for all 4 of these parameter passing variations is already provided. Please do not change any files except tests.scm! In each of their tests.scm files, a place is reserved for you to add your expression.
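The essence of the Task 2 var-exp change is memoization: a reference cell may hold either a value or an unevaluated thunk, and the first dereference forces the thunk and writes the result back (setref!) so it is never evaluated again. A toy model of that behaviour, sketched in Python rather than the IREF language (the Ref class, deref_by_need, and the calls counter are illustrative names, not part of the project):

```python
# Toy model of call-by-need: a reference cell holds either a value or an
# unevaluated thunk; the first lookup forces the thunk and caches the result.

class Ref:
    def __init__(self, thunk):
        self.contents = thunk           # starts out holding a thunk (a zero-arg function)
        self.is_value = False

calls = []                              # records each actual evaluation of the thunk

def deref_by_need(ref):
    if not ref.is_value:                # w is a thunk: force it once (like value-of-thunk)...
        ref.contents = ref.contents()
        ref.is_value = True             # ...and write the result back (like setref!)
    return ref.contents

def expensive():
    calls.append(1)                     # side effect so we can see how many times it ran
    return 42
```

Dereferencing the same Ref twice yields 42 both times but runs expensive() only once; under call-by-name (no setref!) it would run on every lookup, which is exactly why the two variations can give different results.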
Please keep in mind that you should add the same expression in both of the parameter passing variations. In other words, if you wrote an expression that gives different outcomes in, for instance, call-by-value and call-by-need, please add this expression in both of their tests.scm files.

Notes:
• If your code gives any error, then you will directly receive 0 points from this task.
• For simplicity, assign-exp and begin-exp have also been added to the call-by-value code.
• In the call-by-value code, some expressions and structures such as mutable pairs are not defined. Keep these differences in mind while trying to write your expression.

2. Continuation Passing Style

Task 4: Using Scheme [1], implement a function fibonacci that takes a parameter n and returns the nth Fibonacci number, with Continuation Passing Style. The Fibonacci sequence goes like: F = [1, 1, 2, 3, 5, 8, 13, . . .] where F[1] = 1, F[2] = 1, and F[n] = F[n − 1] + F[n − 2]. [2]

Task 5: You are given a LETREC implementation that has CPS with data-structural representations for continuations. Extend this language to include list and map. [3]

Important: Your implementations must use CPS. Furthermore, in your CPS implementations your value-of calls should be tail calls only. In particular, you must see the “End of Computation” message appear only once when you run your program. See page 144 of the EOPL book for more detail.
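To illustrate the shape Task 4 is asking for, here is CPS fibonacci sketched in Python (your submission must be plain Scheme, but the structure carries over directly: the continuation is an extra parameter, and every result is passed to it rather than returned into an arithmetic context):

```python
# Fibonacci in continuation-passing style: fib_k never returns a value
# directly into an arithmetic context; every result is handed to k.

def fib_k(n, k):
    if n <= 2:
        return k(1)                                   # F[1] = F[2] = 1
    # compute F[n-1]; in its continuation compute F[n-2]; then add and pass on
    return fib_k(n - 1,
                 lambda a: fib_k(n - 2,
                                 lambda b: k(a + b)))

def fibonacci(n):
    return fib_k(n, lambda v: v)                      # initial (identity) continuation
```

fibonacci(7) evaluates to 13, matching F = [1, 1, 2, 3, 5, 8, 13, . . .]. Note how every recursive call to fib_k is in tail position, the same property you must preserve for value-of/k in Task 5.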
Here is an example of the diff expression continuation, with good and bad CPS usage:

; good usage, value-of is in a tail call
(diff1-cont (exp2 saved-env saved-cont)
  (value-of/k exp2 saved-env (diff2-cont val saved-cont)))
(diff2-cont (val1 saved-cont)
  (let ((num1 (expval->num val1))
        (num2 (expval->num val)))
    (apply-cont saved-cont (num-val (- num1 num2)))))

; bad usage, value-of is not in a tail call
(diff1-cont (exp2 saved-env saved-cont)
  (apply-cont saved-cont
    (num-val (- (expval->num val)
                (expval->num (value-of/k exp2 saved-env (end-cont)))))))

We have provided test cases for you in tests.scm, and a few hints can also be found within the code as comments. In particular, we have marked where you should write your code in each file as:

;;;;;;;;;;;;;;;;;;;;;;; TASK 5 ;;;;;;;;;;;;;;;;;;;;;;;;
; some comments
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Do not change anything in the tests.scm file! If you would like to run your own code, write it in the console under top.scm.

List Implementation. Your list implementation will be similar to how we construct arrays from mutable pairs, or how a Scheme list is constructed as pairs. In fact, this part of the task is very similar to exercise 5.6 of the EOPL book, at page 153. You will add two new values to the language:
• pair value
• emptylist value

[1] You do not need to extend a language or anything; just write plain Scheme code.
[2] In some mathematical contexts the sequence starts with 0 instead of 1, but this way is a bit easier to implement.
[3] Hint: You will need to make changes in interp.scm, data-structures.scm, and lang.scm.

A list expression looks like: list(exp1, exp2, …, expN) The list is composed of pairs. Here is an example:

> (run “list(1,2)”)
End of computation.
(pair-val (num-val 1) (pair-val (num-val 2) (emptylist-val)))

The basic list operations you will implement are:
• car(expression) returns the left part of the pair value.
• cdr(expression) returns the right part of the pair value.
• null?(expression) returns true if the expression is an emptylist value. • emptylist actually creates an empty list, with the value emptylist. You will also need to implement 2 extractors and a predicate: • expval->car extracts the car of the expressed pair value. • expval->cdr extracts the cdr of the expressed pair value. • expval-null? returns true if the expressed value is an emptylist value. There are several examples in tests.scm, but here is one that covers most of these operations. > (run “let x = 3 in let arr = list(x, -(x,1)) in let y = if null?(arr) then 0 else car(cdr(arr)) in y”) End of computation. (num-val 2) Note that in this example, just cdr(arr) does not yield 2, but rather we have to do car(cdr(arr)). This is because in fact the first cdr yields a: (pair-val (num-val 2) (emptylist-val)) Map Implementation. The map expression looks like: map(expression, expression) Here, the first expression will be treated like a proc expression with one parameter, and the second expression will be treated like a list expression. As an example, here is subtracting 5 from each element of the list: > (run “map(proc (v) -(v,5), list(5, 10, 2))”) End of computation. (pair-val (num-val 0) (pair-val (num-val 5) (pair-val (num-val -3) (emptylist-val)))) When you run top.scm the tests will run automatically. If everything works fine, you will see “no bugs found” message at the bottom of the console. Even if no bugs are found, if for some test you see more than one “End of Computation.” message, then there is something wrong with how you implemented CPS.
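The pair-val / emptylist-val representation and the behaviour of map can be modelled outside the interpreter as well. Below is a small Python sketch of the same value shapes (the tag tuples, make_list, and map_list are hypothetical helper names for illustration; your actual implementation lives in the LETREC interpreter and must be in CPS):

```python
# Model of the Task 5 list values: a list is nested pairs ending in the empty
# list, exactly like (pair-val v (pair-val ... (emptylist-val))).

EMPTY = ("emptylist-val",)

def pair(a, d):
    return ("pair-val", a, d)

def car(p):
    return p[1]                        # left part of the pair

def cdr(p):
    return p[2]                        # right part of the pair

def is_null(v):
    return v == EMPTY                  # like null?/expval-null?

def make_list(*vals):
    out = EMPTY
    for v in reversed(vals):           # list(1,2) => pair(1, pair(2, empty))
        out = pair(v, out)
    return out

def map_list(f, lst):
    if is_null(lst):
        return EMPTY
    return pair(f(car(lst)), map_list(f, cdr(lst)))
```

For instance, map_list(lambda v: v - 5, make_list(5, 10, 2)) yields the pairs 0, 5, -3 in order, mirroring the sample run of map(proc (v) -(v,5), list(5, 10, 2)); and, as in the handout's example, the first cdr of list(1, 2) is itself a pair whose car is 2.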


[SOLVED] Ece472 assignments 1 to 4 solution

tldr: Perform linear regression of a noisy sinewave using a set of gaussian basis functions with learned location and scale parameters. Model parameters are learned with stochastic gradient descent. Use of automatic differentiation is required. Hint: note your limits!

Problem Statement. Consider a set of scalars {x1, x2, . . . , xN} drawn from U(0, 1) and a corresponding set {y1, y2, . . . , yN} where:

yi = sin(2πxi) + εi   (1)

and εi is drawn from N(0, σnoise). Given the following functional form:

ŷi = Σ (j = 1 to M) wj φj(xi | μj, σj) + b   (2)

with:

φ(x | μ, σ) = exp(−(x − μ)² / σ²)   (3)

find estimates b̂, {μ̂j}, {σ̂j}, and {ŵj} that minimize the loss function:

J(y, ŷ) = ½ (y − ŷ)²   (4)

for all (xi, yi) pairs. Estimates for the parameters must be found using stochastic gradient descent. A framework that supports automatic differentiation must be used. Set N = 50, σnoise = 0.1. Select M as appropriate. Produce two plots. First, show the data points, a noiseless sinewave, and the manifold produced by the regression model. Second, show each of the M basis functions. Plots must be of suitable visual quality.

[Figure 1: Example plots (“Fit 1”, “Bases for Fit 1”, “Fit 2”, “Bases for Fit 2”) for models with equally spaced sigmoid and gaussian basis functions.]

tldr: Perform binary classification on the spirals dataset using a multi-layer perceptron. You must generate the data yourself.

Problem Statement. Consider a set of examples with two classes and distributions as in Figure 1. Given the vector x ∈ R², infer its target class t ∈ {0, 1}. As a model, use a multi-layer perceptron f which returns an estimate for the conditional density p(t = 1 | x):

f : R² → [0, 1]   (1)

parametrized by some set of values θ. All of the examples in the training set should be classified correctly (i.e.
p(t = 1 | x) > 0.5 if and only if t = 1). Impose an L2 penalty on the set of parameters. Produce one plot. Show the examples and the boundary corresponding to p(t = 1 | x) = 0.5. The plot must be of suitable visual quality. It may be difficult to find an appropriate functional form for f; write a few sentences discussing your various attempts.

[Figure 1: Sample spiral data (“Spirals”).]

tldr: Classify mnist digits with an (optionally convolutional) neural network. Get at least 95.5% accuracy on the test set.

Problem Statement. Consider the mnist dataset consisting of 50,000 training images and 10,000 test images. Each instance is a 28 × 28 pixel handwritten digit, zero through nine. Train an (optionally convolutional) neural network for classification using the training set that achieves at least 95.5% accuracy on the test set. Do not explicitly tune hyperparameters based on test set performance; use a validation set taken from the training set, as discussed in class. Use dropout and an L2 penalty for regularization. Note: if you write a sufficiently general program, the next assignment will be very easy. Do not use the built-in mnist data class from tensorflow.

Extra challenge (optional). In addition to the above, the student with the fewest number of parameters for a network that gets at least 80% accuracy on the test set will receive a prize. There will be an extra prize if anyone can achieve 80% on the test set with a single-digit number of parameters. For this extra challenge you can make your network have any crazy kind of topology you’d like; it just needs to be optimized by a gradient-based algorithm.

tldr: Classify cifar. Achieve performance similar to the state of the art. Classify cifar. Achieve a top-5 accuracy of 80%.

Problem Statement. Consider the cifar datasets, which contain 32 × 32 pixel color images.
Train a classifier for each of these with performance similar to the state of the art for cifar. It is your task to figure out what the state of the art is. Feel free to adapt any techniques from papers you read. I encourage you to experiment with normalization techniques and optimization algorithms in this assignment. Write a paragraph or two summarizing your experiments. Hopefully you’ll be able to reuse your mnist program.
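The gaussian-basis regression from the first assignment above can be prototyped in plain Python. The sketch below deliberately simplifies: the centres μj and scales σj are fixed rather than learned (the assignment requires learning them via automatic differentiation), and the choices M = 8, σ = 0.15, learning rate 0.1, and 100 epochs are arbitrary values for the sketch, not recommendations:

```python
import math
import random

random.seed(0)

# Synthetic data per the handout: y = sin(2*pi*x) + eps, x ~ U(0, 1)
N, sigma_noise = 50, 0.1
xs = [random.random() for _ in range(N)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, sigma_noise) for x in xs]

# Gaussian basis phi(x | mu, sigma) = exp(-(x - mu)^2 / sigma^2).
# Centres/scales are FIXED here for brevity; the assignment requires
# learning mu and sigma too, via autodiff.
M = 8
mus = [j / (M - 1) for j in range(M)]   # equally spaced over [0, 1]
sig = 0.15

def phi(x, m):
    return math.exp(-(x - m) ** 2 / sig ** 2)

w = [0.0] * M
b = 0.0
lr = 0.1

def predict(x):
    return sum(w[j] * phi(x, mus[j]) for j in range(M)) + b

def mean_loss():                        # J = 0.5*(y - yhat)^2, averaged over the data
    return sum(0.5 * (y - predict(x)) ** 2 for x, y in zip(xs, ys)) / N

before = mean_loss()
for epoch in range(100):                # stochastic gradient descent on (w, b)
    for x, y in zip(xs, ys):
        g = predict(x) - y              # dJ/dyhat
        b -= lr * g                     # dyhat/db = 1
        for j in range(M):
            w[j] -= lr * g * phi(x, mus[j])   # dyhat/dwj = phi_j(x)
after = mean_loss()
```

Because the model is linear in (w, b) once the bases are fixed, this SGD loop reliably drives the loss toward the noise floor; the full assignment additionally backpropagates through μ and σ, which is where the autodiff framework earns its keep.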


[SOLVED] Com s/se 319 : hw 1 to 5 solutions

1. APPLICATION SECTION: Server Client/Thread

Create a chat application. You will need to create both the server and the client code. The server and clients should run on localhost. Below are some features that you should incorporate. Note: we may have under-specified what you need to do. If so, make up your own rules for those situations. In other cases, follow the requirements carefully.

1.1 Connect to Server (15 points)
a) When you start a client, it should come up with a prompt; you should make it entirely text based. (5 points)
> Enter your Name: (Type in your name, then press Enter)
b) After the user enters a name, the client should be connected to your server. (10 points)

1.2 Send text message to server (25 points)
a) Send the user name and message to the server. (5 points)
b) The server then broadcasts the client’s text message to every other currently connected client (i.e. not to the sending client). (10 points)
c) Messages should be printed in each client’s console and in the server’s console. (10 points)

2. What to Submit: Submit via Canvas a compressed file (.zip) (rename it with your LAST NAME) containing the following:
1. Zip your Eclipse project and submit on Canvas along with a README file and the report. Make sure to include all the files that are needed in order to run your program(s). [All APPLICATION SECTION points = 5+10+5+10+10 = 40 points]
2. A README file explaining how to compile and run the program. [5 points]
3. A report (.docx or .pdf) describing your solution approach and screenshots of every required output. [5 points]

1. Warm Up: Try Some Examples (HTML & Javascript)
a. First, open Canvas, go to Assignments, and then download the HW02.zip file into your workspace (U:workspace or something like that!). Then, unzip.
b. Play with each of the given examples (in the examples directory). Open them using a text editor of your choice and modify parts of the html or js files to learn how the different instructions work.
If you want to use Eclipse instead of notepad or vim or emacs etc., create a new static web project, create a new html file, and open it with a browser. Note: w3schools.com is a good site to learn about web technologies. Note that the assignment assumes you have understood these examples. Note: Please always use relative paths in your homeworks and Portfolios.

2. Form Validation
2.1 Create a form in HTML and validate entries of the form using javascript. [25 points]
2.1.1 Create two files: validation1.html and validation1.js.
2.1.2 The TITLE of the validation1.html page should be “Validation Form”.
2.1.3 Create a HTML form in validation1.html:
a) Containing the fields as in the table below.
b) In addition, it should also have a “Continue” button.
c) Make it look reasonably good. Validation rules will be explained in the next step.

FIELD LABEL (Field Type): Validation rule
First name (TextField): *Required. Must contain only alphabetic or numeric characters.
Last name (TextField): *Required. Must contain only alphabetic or numeric characters.
Gender (Dropdown: male, female): *Required.
State (Dropdown: California, Florida, New York, Texas, Hawaii, Washington, Colorado, Virginia, Iowa, Arizona): *Required. Select from the given list.
*Required field = Cannot be Empty.

d) Read https://www.w3schools.com/js/js_validation.asp.
e) Now, write Javascript code in validation1.js so that when the user clicks the “Continue” button it does the following:
● It validates the entries (as per the table above), and for each entry displays the correct image if the validation was successful; otherwise it displays the wrong image. (These images are included in the lab’s zip file as correct.png and wrong.png.)
● Once the validation is successful, it goes to the next page (validation2.html).
f) Remember to include validation1.js in the head element of validation1.html.

2.2 Create a form in HTML and validate entries of the form using javascript. [20 points]
2.2.1 Create two files: validation2.html and validation2.js.
2 2.2.2 The TITLE of the validation2.html page should be “Contact information”. 2.2.3 Create a HTML form in validation2.html: a) Containing the fields as in the table below. b) In addition, it should also have a “Submit” button. c) Make it look reasonably good. Validation rules will be explained in next step. FIELD LABEL Field Type Validation rule RESULT Email TextField *Required. Must be in the form [email protected] x should be alphanumeric (e.g. no special symbols). / Phone TextField *Required. Must be in the form xxx-xxx-xxxx or xxxxxxxxxx. x should be numeric / Address TextField *Required Must be in the form of city & state. example: Ames,IA / *Required field = Cannot be Empty. d) Read https://www.w3schools.com/tags/att_input_pattern.asp. Also, do look at validation example in ExamplesJS folder. e) Write Javascript code in validation2.js to validate the form as per the rules in the above table when the user clicks Submit button f) Your code should display if the validation was successful, or if there was an error, display . g) Remember to include validation2.js in the head section of validation2.html. 3 2. What to Submit: Make sure your solutions work on Chrome (which is what TAs will use to grade the assignment). Submit via Canvas a compressed file (.zip) (rename it with your LAST NAME) containing the following: 1. All the files (.html and .js) which are needed in order to run your program(s). [45 points] 2. A report (.docx or .pdf) describing your solution approach and screenshots of every required output. [5 points].This assignment is focused on UI and event driven programming and Event Handling Task 1: UI and Event Driven Programming: (30 points) Objectives: Learn to use Javascript objects, functions, and closures to implement UI and event driven programming. Warm-up: NOTE 1: One suggestion (to help you play with javascript) is to use online Javascript code tool like https://codepen.io/pen/ or https://jsbin.com. 
They are very useful for trying javascript examples as you can change the html or javascript directly on the website, and you can immediately see the results of your changes. NOTE 2: You will need to also learn how to use the available tools for JS debugging. Firefox has tools->WebDeveloper->Debugger, Chrome has Tools->Developer Tools (ctrl-shift-I). NOTE 3: Play with each of the given examples (in the examples directory). Open them using a text editor of your choice and modify parts of the html or js files to learn how the different instructions work. Task: A complete example of another program (Matching game) is provided in folder SampleProgram. Please take a look at that one first. A starting template is provided in folder ExerciseHelp. Your assignment is to use this template to create a simple decimal calculator programs using objects, functions, and closures. This calculator should look approximately like the below picture. You can look at a normal calculator to figure out the functionality of M+, M-, MR and MC. For decimal calculator, 1. “ + ”, “ – ”, “ * ” , “ / ” should be used respectively for addition, subtraction, multiplication and division. (2×4 = 8 points) 2. “ . ” should be used for operation with decimals. (3 points) 3. Negative number operations e.g., “(-2)-3 = -5” (3 points) 4. Assume that the calculator does not need to calculate complex operations such as 5 + 5 * 5. Instead, expect users to press “=” operator after a basic operation. So, press 5 + 5 followed by =. At this point it should show 10. Then, press “*” and then 5 followed by “=”. At this point show 50. When an operator button is pressed, the operator button’s font becomes red. In other words, assume that we are expecting user to enter only “operand1 operator operand2 = “. However, we can use the results of the previous operation as the first operand for the next operation. Check list: [ ] Your javascript file should be named “calculator.js”. [ ] Use relative path in all of your files. 
[ ] Name your Objects based on their purpose. Do the same with your JavaScript functions. [ ] Show UI Display for decimal calculator correctly. (3 points) [ ] MR (shows memory value on screen) (2 points) [ ] MC (clears memory value) (2 points) [ ] M+ (Whatever is on screen gets added to memory) (2 points) [ ] M- (Whatever is on screen gets subtracted from memory) (2 points) [ ] C (clears screen value, clear the last operation, press “=” will not repeat the last operation) (2points) [ ] = (shows results of an operation) and highlight the last button (any digit/ operator) clicked (3 points) [ ] Make sure that your variables are not global (so that if someone includes some other js files with same names for variables, then your code still works ok). Task 2: Event Handling (15 points) Write a Javascript and HTML code (named snake.html and snake.js) to implement the functionality shown in ‘Problem2Output.mp4’ included in the zip file. Note: 1. The line you create can go over any previous paths. [4 points] 2. The line will bend left when left button is clicked. [4 points] 3. The line will bend right when right button is clicked. [4 points] 4. The line should stop if it touches any boundary. [3 points] Hints: 1. Use HTML5 Canvas (see https://www.w3schools.com/graphics/canvas_intro.asp) 2. Make sure to use a timer (see example below) to update the canvas (so that the snake keeps moving). A Timer has two main functionalities that can be used in the project. a. The setInterval(function, delay) schedules the “code” after every “delay” microseconds. b. The clearInterval removes the timer Here is an example of timer code. This will countdown from 100 until you press stop! What to Submit: Make sure your solutions work on Chrome as TAs will use it to grade the assignment. Submit via Canvas a compressed file (.zip) containing the following: ● lab.html, calculator.js, for Task 1 and snake.html and snake.js for Task 2. 
[Task 1 + Task 2 = 30 + 15 = 45 Points]
● README file explaining how to compile and run your program & a Report (.docx or .pdf) describing your solution approach and screenshots of every required output. [5 points]

This assignment is focused on node.js.

Task 1: (45 points)
Objectives: Learn node.js programming.
Warm-up: NOTE 1: Play with the given example. Open it using a text editor of your choice and modify it to learn how the different instructions work.
Task: *It will be a console-based application. Your assignment is to create a simple binary calculator program. This calculator should look approximately like the given warm-up exercise. For the binary calculator:
1. Note that for some operations on the binary calculator, it may be more convenient to convert the binary numbers to integers and then do the operation. (This is a suggestion; you can implement your own logic.)
2. You can assume that only positive binary numbers are represented and used. For example, positive 9 is represented as 1001.
3. Binary operator “+” represents plus (5 points)
4. Binary operator “*” represents multiply (5 points)
5. Binary operator “/” represents division (5 points)
6. Binary operator “%” represents mod or remainder (i.e. divide the first value by the second and take what is remaining; only works on positive numbers) (5 points)
7. Unary operator “<<” represents shift left (only works on positive numbers) (5 points)
8. Unary operator “>>” represents shift right (only works on positive numbers) e.g. (101 >> gives 10) (5 points)
9. Binary operator “&” represents AND (only works on positive numbers) e.g. (101 & 1011 gives 0001) (5 points)
10. Binary operator “|” represents OR (only works on positive numbers) e.g. (101 | 1010 gives 1111) (5 points)
11. Unary operator “~” represents NOT (i.e. invert each bit of the binary value; only works on positive numbers) e.g. (101 ~ gives 10) (5 points)

What to Submit: Submit via Canvas a compressed file (.zip) containing the following:
● code(s) for Task 1.
[Task 1= 45 Points] ● README file explaining how to compile and run your program & a Report (.docx or .pdf) describing your solution approach and screenshots of every required output. [5 points].Task: Implement a Turn Based human vs human tic-tac-toe game with suitable GUI. Typically Tic-tac-toe (also known as noughts and crosses or Xs and Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game. The given example of the game is won by the first player, X which has been illustrated in the below figure 1 : (More about Tic-tac-toe:https://en.wikipedia.org/wiki/Tic-tac-toe) Figure 1: Tic-tac-toe Game You have to implement this task using Java code and JavaFX GUI components. Check list: 1. Use the provided images (included in the zip file) for marking X and O. [5 points] 2. Show which player’s turn while playing the game. [5 points] 3. Click on the blank cell to mark X or O (unmarked cell should be checked and marked cell can not be marked again). [10 points] 4. When one player wins, stop the game and show ”Congratulations, X win the game” or ”Congratulations, O win the game” in your designed GUI. [10 points] 5. When all cells are filled in and no one wins, stop the game and show ”Draw”. [10 points] 6. When the game is over, show the option to restart a new game. [5 points] 1 What to Submit: Submit via Canvas a compressed file (.zip) [rename it with your LAST NAME] containing the following: ● All of your source code (e.g., .java files). [Task 1= 45 Points] ● README file explaining how to compile and run your program & a Report (.docx or .pdf) describing your solution approach and screenshots of every required output. [5 points]. _________________________________
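The win/draw rule at the heart of the tic-tac-toe task is independent of the GUI, so it is worth isolating before wiring it into JavaFX. A language-neutral sketch (in Python; your submission must be Java with JavaFX components, and the function and constant names here are only illustrative):

```python
# Win/draw logic for a 3x3 tic-tac-toe board. Cells hold "X", "O", or None.
# The real assignment wires this rule into a JavaFX GUI.

LINES = [
    [(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)], [(2, 0), (2, 1), (2, 2)],  # rows
    [(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)], [(0, 2), (1, 2), (2, 2)],  # columns
    [(0, 0), (1, 1), (2, 2)], [(0, 2), (1, 1), (2, 0)],                            # diagonals
]

def winner(board):
    """Return 'X' or 'O' if a line is complete, 'Draw' if the board is full,
    or None while the game is still in progress."""
    for line in LINES:
        a, b, c = (board[r][col] for r, col in line)
        if a is not None and a == b == c:
            return a
    if all(cell is not None for row in board for cell in row):
        return "Draw"
    return None
```

Checking this after every click directly drives checklist items 4 and 5: a returned mark triggers the "Congratulations, X win the game" / "O win the game" message, "Draw" triggers the draw message, and None means play continues.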


[SOLVED] Ece365 programming assignment 1 to 4 solutions

First, you are going to create a hash table class. Then you are going to write a program that uses your hash table class to read in a “dictionary” and spell check a “document”. For the purposes of this assignment, a valid word is defined as any sequence of valid characters, and the valid characters are letters (capital and lowercase), digits (0 – 9), dashes (-), and apostrophes (‘). Every other character is considered a word separator. A dictionary is defined as a list of recognized words. The dictionary is guaranteed to contain exactly one word per line, with no leading or trailing spaces, followed by a single, Unix-style newline character ( ). Some of the words in the dictionary might not be valid (i.e., they may contain invalid characters). When loading the dictionary, invalid words, and also words that are too long (see below), can optionally be ignored. The dictionary does not specify the meanings of words; it just lists them. The document to spell check may be any valid text file. Each line in the document will end with a single, Unix-style newline character. When spell checking the document, your program should indicate every unrecognized word, including the line number on which it occurs. Words should only be allowed to grow up to 20 characters. If a word in the document is too long, you should indicate the line number on which this occurs along with the first 20 characters of the word. The first line in the document is line 1. Words in the document that include digits (perhaps in addition to other valid characters) are technically valid but should not be spell checked (i.e., your program should ignore them). In the document, as previously stated, every character that is not a valid word character is a word separator; e.g., the string “abc@def” represents two valid words, “abc” and “def”. Therefore, there cannot be invalid words in the document. 
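The word rules above (letters, digits, dashes, and apostrophes are word characters; everything else separates words; words containing digits are valid but skipped) are easy to get subtly wrong, so here is a compact model of the tokenization, sketched in Python even though the assignment must be written in C++ (the regex and the helper name words_to_check are illustrative, not prescribed by the handout):

```python
import re

# Valid word characters per the handout: letters, digits, dashes, apostrophes.
WORD = re.compile(r"[A-Za-z0-9'-]+")

def words_to_check(line):
    """Split a line into valid words, lowercase them, and drop words that
    contain digits (such words are valid but are not spell checked)."""
    out = []
    for w in WORD.findall(line.lower()):
        if not any(c.isdigit() for c in w):
            out.append(w)
    return out
```

For example, "abc@def" splits at the separator "@" into two words, abc and def, exactly as the handout describes, while a word like "2nd" is recognized but excluded from spell checking. The 20-character length limit and line-number reporting would sit on top of this in the C++ version.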
Your program should be case insensitive, and all capital letters in both the dictionary and the document should be converted to lowercase immediately upon seeing them. Your program must be written in C++. In order to implement this task efficiently, you will use a hash table. You must implement a hash table class using separate files, including a header file and a source code file. Not every member function of the class will be necessary for this assignment, but you will reuse this class for your next two assignments. Since our textbook provides code for the separate chaining and quadratic probing collision resolution strategies, I am requiring that you use either linear probing or double hashing. (Linear probing is a bit simpler to implement, and you will not receive any extra credit if you choose double hashing.) You are welcome to look at the book’s code for the other two strategies, but keep in mind that the instructions I am specifying for your hash table class make this different than the book’s implementation in several ways. For example, the book uses templates for its hash table class, but you will not. Also, your hash table class will allow the programmer to associate additional data with each entry, while the book’s implementation does not. More details about the requirements for your hash table class will be discussed later in this handout and in class. To process the dictionary, simply insert every word in the dictionary into the hash table. To spell check the document, locate every valid word in the document (keeping track of line numbers), and lookup (i.e., search for) each word in the hash table to see if it is recognized. You should assume that an average dictionary contains about 50,000 words, but that some might be as large as 1,000,000 words. This is my way of telling you that you should implement a rehash member function! A sample dictionary, a bit on the small side (approximately 25,000 words), will be posted on the course home page. 
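To make the required shape concrete, here is a sketch of a linear-probing table with rehashing, in Python rather than the required C++ (method names loosely follow the provided hash.h; the half-full rehash threshold and the 2n+1 growth rule are illustrative choices, and a real implementation should grow to a prime via something like getPrime, as the header's comments describe):

```python
# Shape of the required hash table: open addressing with LINEAR PROBING and a
# rehash that roughly doubles capacity once the table is half full.
# The real assignment implements this in C++ (hash.h / hash.cpp).

class HashTable:
    def __init__(self, size=11):
        self.capacity = size
        self.slots = [None] * size             # each slot: None or (key, pointer)
        self.filled = 0

    def _probe(self, key):
        """Return the index of key's slot, or of the first empty slot."""
        i = hash(key) % self.capacity
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.capacity        # linear probing: step to the next slot
        return i

    def insert(self, key, pv=None):
        if self.slots[self._probe(key)] is not None:
            return 1                           # key already exists in the table
        if 2 * (self.filled + 1) > self.capacity:
            self._rehash()                     # keep load factor at or below 1/2
        self.slots[self._probe(key)] = (key, pv)
        self.filled += 1
        return 0

    def contains(self, key):
        return self.slots[self._probe(key)] is not None

    def _rehash(self):
        old = self.slots
        self.capacity = 2 * self.capacity + 1  # a prime should be chosen in real code
        self.slots = [None] * self.capacity
        self.filled = 0
        for item in old:
            if item is not None:
                self.insert(*item)             # re-probe every entry in the new table
```

Loading a dictionary is then just one insert per word, and spell checking is one contains per document word; the rehash is what keeps both fast when the dictionary grows toward 1,000,000 words. (Lazy deletion for remove, needed in later assignments, is omitted here.)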
Your program should prompt the user for the name of the dictionary file, the name of the document file to be spell-checked, and the name of the file where output should be written. Your program should indicate how long, in seconds, it takes to read the dictionary and how long it takes to spell check the text file, measured in terms of CPU time. (These times should be displayed to standard output, not to the output file.) Your program must compile and run correctly using the g++ compiler on either Cygwin or Ubuntu. Your hash table implementation must include a header file called “hash.h” and a source code file called “hash.cpp”. The spell-checking code, and the rest of the main program, should be included in a separate file. You should also create a “Makefile” that I can use to compile your program. I will provide a sample “Makefile” that I used for my version of the program (also shown on the final page of this handout). I will also provide my version of “hash.h” (also shown later in this handout). You may reuse these two files directly if you wish. These files will also be posted on the course home page and discussed in class. A sample document, a sample run of the program using that document, and a sample output file appear on pages 3 and 4 of this handout. The document file used for this sample run, as well as the dictionary used, will be available from the course home page. Your output file should adhere exactly to the format shown, and all the messages should be worded exactly the same way, with the same spacing. I will use “diff” to compare your output to mine, and you will lose points for any differences. (Of course, I will test your programs on multiple test cases involving different documents and dictionaries with various sizes.) Note that the output displayed to standard output does not have to match my formatting, as long as the content is the same. Pages 5 and 6 of this handout show my “hash.h” file. 
Your hash table class must implement the same public member functions as mine. Note that the “getPointer”, “setPointer”, and “remove” member functions will not be used for this assignment; however, they will be used for future assignments! It is OK if you do not implement these member functions now. The specifics of this header file, and also a “Makefile” (shown on page 7 of this handout), will be discussed in more detail in class. Both files will also be made available to you from the course home page. When your assignment is complete, e-mail me ([email protected]) your program, including your source code files, your header file(s), and your “Makefile” (even if you used the provided files). In addition to correctness, your grade may also depend on the efficiency and elegance of your code and adherence to proper C++ style. Your program is due before midnight on the night of Wednesday, September 26. Below are the lyrics to “Supercalifragilisticexpialidocious” from “Mary Poppins”. This represents the contents of the document “lyrics.txt” used in the sample run shown on the next page. This file will also be posted on the course home page. Um-deedledeedledeedle um-deedledayUm-deedledeedledeedle um-deedledayUm-deedledeedledeedle um-deedledeedleUm-deedledeedledeedle um-um um-um um-um For example… SupercalifragilisticexpialidociousEven though the sound of it is something quite atrociousIf you say it loud enough you’ll always sound precociousSupercalifragilisticexpialidocious Um-deedledeedledeedle um-deedledayUm-deedledeedledeedle um-deedledayUm-deedledeedledeedle um-deedleday Super-superSupercaliSuper Supercalifragi So when the cat has got your tongue there’s no need for dismayJust summon up this word and then you’ve got a lot to sayBut better use it carefully or it can change your life For example… Yes? 
One day I said it to me girl and now me girl’s me wife

Supercalifragilisticexpialidocious
Even though the sound of it is something quite atrocious
If you say it loud enough you’ll always sound precocious
Supercalifragilisticexpialidocious

Supercalifragilisticexpialidocious
Even though the sound of it is something quite atrocious
If you say it loud enough you’ll always sound precocious
Supercalifragilisticexpialidocious

Supercalifragilisticexpialidocious
Even though the sound of it is something quite atrocious
If you say it loud enough you’ll always sound precocious
Supercalifragilistic
Supercalifragilistic
Supercalifragilisticexpialidocious

Below is a sample run using the sample dictionary provided on the course home page and a text file that contains the lyrics to "Supercalifragilisticexpialidocious" from "Mary Poppins".

Enter name of dictionary: DICT/wordlist_small
Total time (in seconds) to load dictionary: 0.031
Enter name of input file: FILES/lyrics.txt
Enter name of output file: out_lyrics_small.txt
Total time (in seconds) to check document: 0

The output file should look exactly like this:

Long word at line 1, starts: um-deedledeedledeedl
Unknown word at line 1: um-deedleday
Long word at line 2, starts: um-deedledeedledeedl
Unknown word at line 2: um-deedleday
Long word at line 3, starts: um-deedledeedledeedl
Unknown word at line 3: um-deedledeedle
Long word at line 4, starts: um-deedledeedledeedl
Unknown word at line 4: um-um
Unknown word at line 4: um-um
Unknown word at line 4: um-um
Long word at line 8, starts: supercalifragilistic
Long word at line 11, starts: supercalifragilistic
Long word at line 13, starts: um-deedledeedledeedl
Unknown word at line 13: um-deedleday
Long word at line 14, starts: um-deedledeedledeedl
Unknown word at line 14: um-deedleday
Long word at line 15, starts: um-deedledeedledeedl
Unknown word at line 15: um-deedleday
Unknown word at line 17: super-super
Unknown word at line 18: supercali
Unknown word at line 19: supercalifragi
Unknown word at line 21: has
Unknown word at line 21: there’s
Unknown word at line 21: dismay
Unknown word at line 23: better
Unknown word at line 23: carefully
Unknown word at line 27: yes
Unknown word at line 29: girl’s
Long word at line 31, starts: supercalifragilistic
Long word at line 34, starts: supercalifragilistic
Long word at line 36, starts: supercalifragilistic
Long word at line 39, starts: supercalifragilistic
Long word at line 41, starts: supercalifragilistic
Unknown word at line 44: supercalifragilistic
Unknown word at line 45: supercalifragilistic
Long word at line 46, starts: supercalifragilistic

Below and on the next page, I am providing you with the header file ("hash.h") for my hash table implementation. This file will also be posted on the course home page.

#ifndef _HASH_H
#define _HASH_H

#include <string>
#include <vector>

class hashTable {

 public:

  // The constructor initializes the hash table.
  // Uses getPrime to choose a prime number at least as large as
  // the specified size for the initial size of the hash table.
  hashTable(int size = 0);

  // Insert the specified key into the hash table.
  // If an optional pointer is provided,
  // associate that pointer with the key.
  // Returns 0 on success,
  // 1 if key already exists in hash table,
  // 2 if rehash fails.
  int insert(const std::string &key, void *pv = NULL);

  // Check if the specified key is in the hash table.
  // If so, return true; otherwise, return false.
  bool contains(const std::string &key);

  // Get the pointer associated with the specified key.
  // If the key does not exist in the hash table, return NULL.
  // If an optional pointer to a bool is provided,
  // set the bool to true if the key is in the hash table,
  // and set the bool to false otherwise.
  void *getPointer(const std::string &key, bool *b = NULL);

  // Set the pointer associated with the specified key.
  // Returns 0 on success,
  // 1 if the key does not exist in the hash table.
  int setPointer(const std::string &key, void *pv);

  // Delete the item with the specified key.
  // Returns true on success,
  // false if the specified key is not in the hash table.
  bool remove(const std::string &key);

 private:

  // Each item in the hash table contains:
  // key - a string used as a key.
  // isOccupied - if false, this entry is empty,
  //              and the other fields are meaningless.
  // isDeleted - if true, this item has been lazily deleted.
  // pv - a pointer related to the key;
  //      NULL if no pointer was provided to insert.
  class hashItem {
   public:
    std::string key;
    bool isOccupied;
    bool isDeleted;
    void *pv;
  };

  int capacity; // The current capacity of the hash table.
  int filled; // Number of occupied items in the table.

  std::vector<hashItem> data; // The actual entries are here.

  // The hash function.
  int hash(const std::string &key);

  // Search for an item with the specified key.
  // Return the position if found, -1 otherwise.
  int findPos(const std::string &key);

  // The rehash function; makes the hash table bigger.
  // Returns true on success, false if memory allocation fails.
  bool rehash();

  // Return a prime number at least as large as size.
  // Uses a precomputed sequence of selected prime numbers.
  static unsigned int getPrime(int size);
};

#endif //_HASH_H

This page shows the "Makefile" that I used for my program. This will also be posted on the course home page.

spell.exe: spellcheck.o hash.o
	g++ -o spell.exe spellcheck.o hash.o

spellcheck.o: spellcheck.cpp hash.h
	g++ -c spellcheck.cpp

hash.o: hash.cpp hash.h
	g++ -c hash.cpp

debug:
	g++ -g -o spellDebug.exe spellcheck.cpp hash.cpp

clean:
	rm -f *.exe *.o *.stackdump *~

backup:
	test -d backups || mkdir backups
	cp *.cpp backups
	cp *.h backups

You are going to create a class called "heap" that provides programmers with the functionality of a priority queue using a binary heap implementation. Each item inserted into the binary heap will specify a unique string id, an integer key, and optionally any pointer. The implementation of the class should use pointers to void in order to handle pointers to any type of data.
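A tiny illustration of the void-pointer technique described here (the names are ours, not part of the assignment's interface): the container stores an untyped pointer, and the caller casts it back to the type it knows it stored.

```cpp
#include <string>

// A container field of type void* can hold a pointer to anything.
// The container never inspects the pointed-to data; only the caller,
// who knows the real type, casts it back.
struct Slot { void *pv; };

std::string *storedName(Slot &s) {
    return static_cast<std::string *>(s.pv);  // caller supplies the type
}
```

This is exactly the pattern behind the pv parameters in hash.h and in the heap interface described below.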
When a heap is declared, a capacity will be passed to its constructor representing the maximum number of items that may be in the heap at one time; the heap will never be allowed to grow past its initial capacity (although it is not difficult to implement a resize operation).

I have written a program that uses my own implementation of the class. I will provide you with that program (useHeap.cpp) and with a Makefile. You should not change my source code file! You are allowed to add C++11 flags to the Makefile if you need to. Both files will be discussed in class and can be obtained from the course home page.

Your heap will also make use of the hash table class that you created for the previous programming assignment. This assignment asks you to fill in the missing heap.cpp and heap.h files, and to correct or add to your hash.cpp file if necessary, so that everything works. This implies that your heap class must include at least the following: a constructor that accepts an integer representing the capacity of the binary heap; a public member function, insert, used to insert a new item into the heap; a public member function, deleteMin, that removes the item with the lowest key from the heap; a public member function, setKey, providing both increaseKey and decreaseKey functionality; and a public member function, remove, that allows the programmer to delete an item with a specified id from the heap. In class we will discuss the parameters of these member functions and their return values.

In addition, your class should contain private data members and private member functions that allow you to elegantly and efficiently implement the required public member functions. I will discuss my own implementation in class, and it is described on the next page. In class, we will also look at a sample run of the program and discuss the provided code.
This program only passes string ids and integer keys to the insert member function of the heap class, but again, the insert member function should also optionally accept any pointer that can be stored and associated with the id. In the future, you will be using the class you write for this assignment in order to implement an algorithm involving graph data structures, and this functionality will be necessary. Also note that the integer keys will not necessarily be positive integers.

All operations should be implemented using average-case logarithmic time (or better) algorithms. In order to achieve setKey and remove in average-case logarithmic time, your program needs to be able to map an id to a node quickly. Since each id can be any arbitrary string, a hash table is useful for this purpose. Searching a heap to find an item with a particular id would require linear time, but a hash table in which each hash entry includes a pointer to the associated node in the heap allows you to find the item in constant average time. Apart from the calls to the hash table member functions, which are worst-case linear time but average-case constant time operations, all heap operations should use worst-case logarithmic time algorithms, and the insert operation should use an average-case constant time algorithm.

My heap class contains four private data members. Two are simple integers representing the capacity and the current size of the heap. The third is a vector of node objects containing the actual data of the heap; each node contains a string id, an integer key, and a pointer to void that can point to anything. (I have made "node" a private nested class within the heap class.) The fourth private data member is a pointer to a hash table (the actual hash table is allocated in the heap's constructor).
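The id-to-position bookkeeping described above can be sketched as follows. This is a simplified illustration, not the assignment's interface: std::unordered_map stands in for the course hashTable, and the capacity limit and void* payload are omitted. The key point is that every swap during percolation also updates the map, which is what keeps setKey (and remove) logarithmic.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class MiniHeap {
public:
    int insert(const std::string &id, int key) {
        if (pos.count(id)) return 2;              // duplicate id
        data.push_back({id, key});
        pos[id] = (int)data.size() - 1;
        percolateUp((int)data.size() - 1);
        return 0;
    }
    int deleteMin(std::string *pId = nullptr, int *pKey = nullptr) {
        if (data.empty()) return 1;
        if (pId) *pId = data[0].id;
        if (pKey) *pKey = data[0].key;
        pos.erase(data[0].id);                    // drop the mapping
        data[0] = data.back();
        data.pop_back();
        if (!data.empty()) { pos[data[0].id] = 0; percolateDown(0); }
        return 0;
    }
    int setKey(const std::string &id, int key) {
        auto it = pos.find(id);
        if (it == pos.end()) return 1;            // unknown id
        int i = it->second, old = data[i].key;
        data[i].key = key;
        if (key < old) percolateUp(i); else percolateDown(i);
        return 0;
    }

private:
    struct Node { std::string id; int key; };
    std::vector<Node> data;                       // implicit binary tree
    std::unordered_map<std::string, int> pos;     // id -> index in data

    void swapNodes(int a, int b) {
        std::swap(data[a], data[b]);
        pos[data[a].id] = a;                      // keep the map current
        pos[data[b].id] = b;
    }
    void percolateUp(int i) {
        while (i > 0 && data[i].key < data[(i - 1) / 2].key) {
            swapNodes(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
    void percolateDown(int i) {
        int n = (int)data.size();
        for (;;) {
            int c = 2 * i + 1;
            if (c >= n) return;
            if (c + 1 < n && data[c + 1].key < data[c].key) ++c;
            if (data[c].key >= data[i].key) return;
            swapNodes(i, c);
            i = c;
        }
    }
};
```

In the real assignment, the hash entry's void* would point at the heap node instead of holding an index in a standard-library map.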
Since the constructor is provided with the maximum size of the heap, you may allocate the hash table to be large enough such that there is a small likelihood of a rehash, but that is up to you. (Note that since items get removed from the heap, but only lazily deleted from the hash table, it is still possible that a rehash of the hash table will be necessary.)

Your heap.h file should contain the declaration of your class along with the declarations of its public and private data members and member functions. The heap.cpp file should contain the implementation of the class. I don't think you should need to implement any functions other than the member functions of the class itself in this file (I did not). As usual, you will be graded not only on the correctness of your program, but also on the appropriateness of the decisions that you make, the elegance (and perhaps the formatting) of your code, and on the appropriate use of C++ concepts and routines.

The following page shows the declarations of the constructor and the public member functions of my heap class along with my comments describing their functionality, parameters, and return values. I am not showing the declarations of my private data members or private member functions here, but this will be discussed further in class.

E-mail me ([email protected]) your program, including all source code files, header files, and your Makefile (including any provided files that you use without making changes). Your program must compile and run using either Ubuntu or Cygwin. The program is due before midnight on the night of Wednesday, October 24.

//
// heap - The constructor allocates space for the nodes of the heap
// and the mapping (hash table) based on the specified capacity
//
heap(int capacity);

//
// insert - Inserts a new node into the binary heap
//
// Inserts a node with the specified id string, key,
// and optionally a pointer. The key is used to
// determine the final position of the new node.
//
// Returns:
//   0 on success
//   1 if the heap is already filled to capacity
//   2 if a node with the given id already exists (but the heap
//     is not filled to capacity)
//
int insert(const std::string &id, int key, void *pv = NULL);

//
// setKey - set the key of the specified node to the specified value
//
// I have decided that the class should provide this member function
// instead of two separate increaseKey and decreaseKey functions.
//
// Returns:
//   0 on success
//   1 if a node with the given id does not exist
//
int setKey(const std::string &id, int key);

//
// deleteMin - return the data associated with the smallest key
//             and delete that node from the binary heap
//
// If pId is supplied (i.e., it is not NULL), write to that address
// the id of the node being deleted. If pKey is supplied, write to
// that address the key of the node being deleted. If ppData is
// supplied, write to that address the associated void pointer.
//
// Returns:
//   0 on success
//   1 if the heap is empty
//
int deleteMin(std::string *pId = NULL, int *pKey = NULL, void *ppData = NULL);

//
// remove - delete the node with the specified id from the binary heap
//
// If pKey is supplied, write to that address the key of the node
// being deleted. If ppData is supplied, write to that address the
// associated void pointer.
//
// Returns:
//   0 on success
//   1 if a node with the given id does not exist
//
int remove(const std::string &id, int *pKey = NULL, void *ppData = NULL);

You are going to implement Dijkstra's algorithm to solve the single-source shortest-path problem. The program will determine the shortest path in a specified graph from a specified starting vertex to each other vertex in the graph. In order to do this efficiently, your program should use the binary heap class that you created for the previous assignment. Your program should start by asking the user to enter the name of a file specifying the graph.
Every row in the input file represents an edge in the graph. Each row consists of two string ids representing the source vertex and destination vertex of the edge (in that order), followed by an integer representing the cost (a.k.a. distance or weight) of the edge. The rows will contain no leading or trailing whitespace, single spaces will separate fields, and all rows will end with a single Unix-style newline character. All vertex ids will consist only of lowercase and capital letters and digits. All edge costs will be positive integers less than one million. A vertex exists if it is the source or the destination of any edge. The source vertex of an edge will never be the same as the destination vertex, but it is possible that multiple edges might connect the same vertices. Your program may assume that the file, if it can be opened, is valid. You are not required to include error checks for invalid file formats; you may if you wish, but I will not check for this.

Once the program is finished reading in the graph, the user should be prompted to enter the id of a starting vertex. The user should be re-prompted until they enter a valid id (i.e., a string id representing a vertex that exists in the graph). The program should then apply Dijkstra's algorithm to determine the shortest path to each node from the specified starting vertex. The implementation should rely on the binary heap class that you created for the previous assignment. (The heap class, of course, relies on the hash class you created for the first assignment, and you will also likely rely on the hash class for a couple of other purposes as well.)

When the algorithm has finished determining the shortest path to each node, your program should output the CPU time, in seconds, that was spent executing the algorithm. The program should then ask the user for the name of an output file.
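Since the format guarantees single-space separators and no stray whitespace, stream extraction is enough to split an edge row. A minimal sketch (the Edge struct and function name are our own, not mandated by the handout):

```cpp
#include <sstream>
#include <string>

// One edge row of the form "source dest cost".
struct Edge { std::string src, dst; int cost; };

// Returns false if the row does not contain two ids and an integer.
bool parseEdge(const std::string &line, Edge &e) {
    std::istringstream in(line);
    return static_cast<bool>(in >> e.src >> e.dst >> e.cost);
}
```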
The output file should contain one row for every vertex that exists in the graph, with vertices listed in the same order in which they first appear in the input file. Each row in the output file should contain a vertex id followed by a colon, a single space, and then the shortest distance from the specified starting vertex to the given vertex. All of these distances are guaranteed to be less than one billion. After the distance, the row should contain one space, a left bracket, the path from the starting vertex to the current vertex, a right bracket, and finally a single Unix-style newline character. Vertices in the path should be separated by a comma followed by a single space. There should not be any space or comma before the first vertex in the path (the specified starting vertex) or after the last vertex in the path.

If there is no path from the specified starting vertex to some existing vertex in the graph, the corresponding output row should contain the vertex id followed by a colon, a single space, and then the text "NO PATH" followed by a single Unix-style newline character. You must follow these instructions exactly.

In class, we stepped through Dijkstra's algorithm for a graph that came from Figure 9.20 in the textbook. The file representing this graph might look like this:

v1 v2 2
v1 v4 1
v2 v4 3
v2 v5 10
v3 v1 4
v3 v6 5
v4 v3 2
v4 v5 2
v4 v6 8
v4 v7 4
v5 v7 6
v7 v6 1

Any permutation of the rows representing edges in this file would designate the same graph (but the order of rows in the output file might be different). Assume a file called graph.txt exists, containing the data shown above.
Then a sample run of your program might look like this:

Enter name of graph file: graph.txt
Enter a valid vertex id for the starting vertex: v1
Total time (in seconds) to apply Dijkstra's algorithm: 0.000
Enter name of output file: out.txt

The prompts to the user may vary, but the file out.txt should look exactly like this:

v1: 0 [v1]
v2: 2 [v1, v2]
v4: 1 [v1, v4]
v5: 3 [v1, v4, v5]
v3: 3 [v1, v4, v3]
v6: 6 [v1, v4, v7, v6]
v7: 5 [v1, v4, v7]

If the user specifies the same graph file but enters v5 as the id of the starting vertex, then the output file should look exactly like this:

v1: NO PATH
v2: NO PATH
v4: NO PATH
v5: 0 [v5]
v3: NO PATH
v6: 7 [v5, v7, v6]
v7: 6 [v5, v7]

As already stated, you should rely on your heap implementation (which in turn relies on your hash table implementation) to implement Dijkstra's algorithm efficiently. If you have implemented these classes correctly and completely, you should not need to modify any of your heap or hash files for this assignment.

You should also create a graph class that is designed with Dijkstra's algorithm in mind, so the implementation of the algorithm can be handled by a member function of this class. I suggest including a private nested class to store nodes in the graph. The graph can also contain a linked list of pointers to nodes. Whenever a new node is encountered, you can allocate memory for the node and add a pointer to the new node to the end of the linked list. (Alternatively, you may decide to use a linked list of nodes directly, instead of a linked list of pointers. Either way, you may use the provided C++ list class for this purpose.) One field of each node must store an adjacency list for the node. This can also use the provided linked list class. Each node in an adjacency list represents an edge, and each edge must at least specify the destination vertex and the cost of the edge. (You do not necessarily have to specify the source node in each edge, since it is the same for every node in an adjacency list.)
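Once the graph is in memory, the core of the algorithm is repeated delete-min plus edge relaxation. A minimal sketch, with std::priority_queue standing in for the course heap class and integer vertex indices standing in for the string ids (both simplifications for brevity; the real assignment uses deleteMin/setKey and tracks predecessors to print the paths):

```cpp
#include <climits>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// adj[v] lists (destination, cost) pairs. Returns shortest distances
// from start, with -1 standing in for "NO PATH".
std::vector<long long> dijkstra(
        const std::vector<std::vector<std::pair<int, int>>> &adj, int start) {
    const long long INF = LLONG_MAX;
    std::vector<long long> dist(adj.size(), INF);
    // (distance, vertex) pairs; std::greater<> makes this a min-queue
    std::priority_queue<std::pair<long long, int>,
                        std::vector<std::pair<long long, int>>,
                        std::greater<>> pq;
    dist[start] = 0;
    pq.push({0, start});
    while (!pq.empty()) {
        auto [d, v] = pq.top();
        pq.pop();
        if (d > dist[v]) continue;            // stale queue entry; skip
        for (auto [to, cost] : adj[v])
            if (d + cost < dist[to]) {        // relax the edge
                dist[to] = d + cost;
                pq.push({dist[to], to});
            }
    }
    for (auto &d : dist)
        if (d == INF) d = -1;                 // unreachable vertex
    return dist;
}
```

With the course heap class, the stale-entry check is replaced by setKey on vertices already in the heap, and the void* payload of each heap node points directly at the graph node, as discussed below.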
As you are reading the edges of the graph from the input file, you will need some way to efficiently determine whether or not you have encountered each vertex id already. If not, you need to create a new vertex node. If the source vertex of the edge has been previously encountered, you have to locate the corresponding node efficiently in order to update its adjacency list. I suggest using your own hash table implementation for these tasks. Whenever a new vertex is encountered, add an entry with the new vertex id to the hash table, and use the void pointer to point to the new node. To locate a node corresponding to a source vertex, use the getPointer member function of the hash table class. I also suggest using your hash table class to determine whether or not a starting vertex entered by the user is valid.

When you are implementing Dijkstra's algorithm, I suggest using the void pointer of each heap node to point to the node corresponding to each vertex. You can then use the optional parameter of deleteMin to obtain this pointer and access the node immediately. Although you could also locate the node by just obtaining the vertex id from deleteMin and then using your hash table to obtain the pointer to the node, this is a bit less efficient, and I consider it less elegant, so I may take off a few points for this solution.

After you have completed the assignment, e-mail me ([email protected]) all of your code, including a Makefile. I should be able to run "make" and then test your executable. The program is due before midnight on the night of Monday, November 19.

This problem came from the 1998 regional ACM Programming Contest. As I described in class, it was the only question my Columbia team did not complete in the time limit. We had a solution which would have worked given unlimited time, but we did not realize it was an exponential-time solution for contrived input.
I am letting you know that the solution (probably) requires dynamic programming to be implemented efficiently. If you want to see how the problem was stated at the competition, check out this link: https://www.acmgnyr.org/year1998/prob_g.html

The problem defines a "merge" of two strings as a third string containing all the characters from each of the original two strings mixed together. The two sets of characters can be interspersed, but the characters from each individual string cannot be permuted. For example, one possible merge of "hello" and "world" would be "wohrellold". However, the string "wohrelldol" is not a valid merge. Although this string contains all the correct characters, and "hello" and "world" are both subsequences, there is no way to select two subsequences with distinct characters to form both of the original two strings.

You are asked to write a program that accepts three strings at a time; we'll call them A, B, and C. All strings will consist of only lowercase letters. You can assume that A and B will contain at most 1000 letters, and C will contain at most 2000 letters. Your program should determine whether or not C is a valid merge of A and B. If so, the program should output C with the characters from A converted to uppercase. If more than one merge is possible, the letters of A should be made to occur as early as possible. If no merge is possible, the output should read "*** NOT A MERGE ***".

For this assignment, your program should prompt the user for the names of an input file and an output file. The input file will consist of multiple sets of three strings, one string per line (i.e., the number of rows in the file will be a multiple of three). Your program should read three strings at a time, and it should determine whether or not the third string is a merge of the first two. The output for each set of strings should be written to the output file as specified in the previous paragraph.
Your program should continue to process sets of three strings until it reaches the end of the input file. Every line in the input file will be, and every line in the output file should be, followed by a single Unix-style newline character.

My major hint to you is that dynamic programming is (probably) necessary to write this program in a way such that it will run correctly for certain inputs in reasonable time. Simple algorithms will either get some cases wrong or require exponential time to run. Either top-down dynamic programming or bottom-up dynamic programming is appropriate (although you might run into stack-size problems with top-down dynamic programming on some systems). Note that you should be able to declare a global matrix (i.e., a two-dimensional array) that is big enough to handle all instances of this problem. Do not try to make the matrix a local variable; you might overflow the stack. There is no need to allocate the matrix dynamically.

A sample run of the program might look like this:

Enter name of input file: input.txt
Enter name of output file: output.txt

If the input file looks like this:

chocolate
chips
cchocholaiptes
chocolate
chips
bananasplit
abac
bad
ababacd
hello
world
wohrelldol
ab
ba
abab
zzzzzzzzzzzzzzzzzzzzab
zzzzzzzzzzzzzzzzzzzzac
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzacab
zzzzzzzzzzzzzzzzzzzzabc
zzzzzzzzzzzzzzzzzzzzacb
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzabcbac

Then the output file should look exactly like this:

CcHOChOLAipTEs
*** NOT A MERGE ***
ABAbaCd
*** NOT A MERGE ***
AbaB
ZZZZZZZZZZZZZZZZZZZZzzzzzzzzzzzzzzzzzzzzacAB
*** NOT A MERGE ***

The first three examples (i.e., the first nine rows of the input and the first three rows of the output) were the examples provided at the actual competition. I contrived the other four examples to show cases that make the problem more difficult. Of course, I will test your programs with several additional difficult (and in some cases, much longer) test cases. Submit your program to me via e-mail ([email protected]).
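One possible DP formulation (a sketch, not the only correct approach, and not necessarily the handout's intended one): let ok[i][j] be true when the suffix C[i+j..] can be built by merging A[i..] and B[j..]. After filling the table bottom-up, a greedy front-to-back walk that takes a letter from A whenever the remaining suffix stays feasible makes A's letters occur as early as possible. This sketch uses a heap-allocated table; for the full 1000-by-1000 limits, the handout's advice to use a global array applies.

```cpp
#include <cctype>
#include <string>
#include <vector>

std::string mergeStrings(const std::string &A, const std::string &B,
                         const std::string &C) {
    size_t la = A.size(), lb = B.size();
    if (C.size() != la + lb) return "*** NOT A MERGE ***";

    // ok[i][j]: C[i+j..] is a merge of A[i..] and B[j..].
    std::vector<std::vector<char>> ok(la + 1, std::vector<char>(lb + 1, 0));
    ok[la][lb] = 1;
    for (size_t i = la + 1; i-- > 0;)
        for (size_t j = lb + 1; j-- > 0;) {
            if (i < la && A[i] == C[i + j] && ok[i + 1][j]) ok[i][j] = 1;
            if (j < lb && B[j] == C[i + j] && ok[i][j + 1]) ok[i][j] = 1;
        }
    if (!ok[0][0]) return "*** NOT A MERGE ***";

    // Greedy reconstruction: prefer A whenever the suffix stays feasible,
    // which places A's letters as early as possible.
    std::string out;
    size_t i = 0, j = 0;
    while (i + j < C.size()) {
        if (i < la && A[i] == C[i + j] && ok[i + 1][j]) {
            out += (char)std::toupper((unsigned char)A[i]);  // from A
            ++i;
        } else {
            out += B[j];                                     // from B
            ++j;
        }
    }
    return out;
}
```

Run against the seven sample sets above, this reproduces the expected output file line by line.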
I encourage you to send early presubmissions; I will reply with responses similar to those given at the contest (it may take a day or longer). The program is due before midnight on the night of Wednesday, December 5.


[SOLVED] Ece 232e project 1 to 5 solutions

One can use the igraph library¹ to generate different networks and measure various properties of a given network. The library has R and Python implementations. You may choose either language that you prefer. However, for this project, using R is strongly recommended, as some functions might not be implemented for the Python version of the package.

Submission: Upload a zip file containing your report and codes to CCLE. One submission from any member of a group is sufficient.

¹ https://igraph.sourceforge.net/

1 Generating Random Networks

1. Create random networks using the Erdös-Rényi (ER) model

(a) Create undirected random networks with n = 1000 nodes, and the probability p for drawing an edge between two arbitrary vertices equal to 0.003, 0.004, 0.01, 0.05, and 0.1. Plot the degree distributions. What distribution is observed? Explain why. Also, report the mean and variance of the degree distributions and compare them to the theoretical values.

(b) For each p and n = 1000, answer the following questions: Are all random realizations of the ER network connected? Numerically estimate the probability that a generated network is connected. For one instance of the networks with that p, find the giant connected component (GCC) if the network is not connected. What is the diameter of the GCC?

(c) It turns out that the normalized GCC size (i.e., the size of the GCC as a fraction of the total network size) is a highly nonlinear function of p, with interesting properties occurring for values where p = O(ln n / n). For n = 1000, sweep over values of p in this region and create 100 random networks for each p. Then scatter plot the normalized GCC sizes vs. p. Empirically estimate the value of p where a giant connected component starts to emerge (define your criterion of "emergence"). Does it match the theoretical values mentioned or derived in lectures?

(d) i. Define the average degree of nodes c = n × p = 0.5. Sweep over the number of nodes, n, ranging from 100 to 10000. Plot the expected size of the GCC of ER networks with n nodes and edge-formation probability p = c/n, as a function of n. What trend is observed?
ii. Repeat the same for c = 1.
iii. Repeat the same for values of c = 1.1, 1.2, 1.3, and show the results for these three values in a single plot.

2. Create networks using the preferential attachment model

(a) Create an undirected network with n = 1000 nodes, with the preferential attachment model, where each new node attaches to m = 1 old nodes. Is such a network always connected?
(b) Use the fast greedy method to find the community structure. Measure modularity.
(c) Try to generate a larger network with 10000 nodes using the same model. Compute modularity. How does it compare to the smaller network's modularity?
(d) Plot the degree distribution on a log-log scale for both n = 1000 and n = 10000, then estimate the slope of the plot.
(e) You can randomly pick a node i, and then randomly pick a neighbor j of that node. Plot the degree distribution of the nodes j that are picked with this process, on a log-log scale. How does this differ from the node degree distribution?
(f) Estimate the expected degree of a node that is added at time step i for 1 ≤ i ≤ 1000. Show the relationship between the age of nodes and their expected degree through an appropriate plot.
(g) Repeat the previous parts for m = 2 and m = 5. Why was the modularity for m = 1 high?
(h) Again, generate a preferential attachment network with n = 1000, m = 1. Take its degree sequence and create a new network with the same degree sequence, through a stub-matching procedure. Plot both networks, mark communities on their plots, and measure their modularity. Compare the two procedures for creating random power-law networks.

3. Create a modified preferential attachment model that penalizes the age of a node

(a) Each time a new vertex is added, it creates m links to old vertices, and the probability that an old vertex is cited depends on its degree (preferential attachment) and age.
In particular, the probability that a newly added vertex connects to an old vertex i is proportional to:

P[i] ∼ (c · ki^α + a)(d · li^β + b),

where ki is the degree of vertex i in the current time step, and li is the age of vertex i. Produce such an undirected network with 1000 nodes and parameters m = 1, α = 1, β = −1, and a = c = d = 1, b = 0. Plot the degree distribution. What is the power-law exponent?

(b) Use the fast greedy method to find the community structure. What is the modularity?

2 Random Walk on Networks

1. Random walk on Erdös-Rényi networks

(a) Create an undirected random network with 1000 nodes, and the probability p for drawing an edge between any pair of nodes equal to 0.01.
(b) Let a random walker start from a randomly selected node (no teleportation). We use t to denote the number of steps that the walker has taken. Measure the average distance (defined as the shortest path length) ⟨s(t)⟩ of the walker from his starting point at step t. Also, measure the variance σ²(t) = ⟨(s(t) − ⟨s(t)⟩)²⟩ of this distance. Plot ⟨s(t)⟩ vs. t and σ²(t) vs. t. Here, the average ⟨·⟩ is over random choices of the starting nodes.
(c) Measure the degree distribution of the nodes reached at the end of the random walk. How does it compare to the degree distribution of the graph?
(d) Repeat (b) for undirected random networks with 100 and 10000 nodes. Compare the results and explain qualitatively. Does the diameter of the network play a role?

2. Random walk on networks with fat-tailed degree distribution

(a) Generate an undirected preferential attachment network with 1000 nodes, where each new node attaches to m = 1 old nodes.
(b) Let a random walker start from a randomly selected node. Measure and plot ⟨s(t)⟩ vs. t and σ²(t) vs. t.
(c) Measure the degree distribution of the nodes reached at the end of the random walk on this network. How does it compare with the degree distribution of the graph?
(d) Repeat (b) for preferential attachment networks with 100 and 10000 nodes, and m = 1. Compare the results and explain qualitatively. Does the diameter of the network play a role?

3. PageRank

The PageRank algorithm, as used by the Google search engine, exploits the linkage structure of the web to compute global "importance" scores that can be used to influence the ranking of search results. Here, we use random walks to simulate PageRank.

(a) Create a directed random network with 1000 nodes, using the preferential attachment model, where m = 4. Note that in this directed model, the out-degree of every node is m, while the in-degrees follow a power-law distribution. Measure the probability that the walker visits each node. Is this probability related to the degree of the nodes?
(b) In all previous questions, we didn't have any teleportation. Now, we use a teleportation probability of α = 0.15. By performing random walks on the network created in 3(a), measure the probability that the walker visits each node. Is this probability related to the degree of the node?

4. Personalized PageRank

While the use of PageRank has proven very effective, the web's rapid growth in size and diversity drives an increasing demand for greater flexibility in ranking. Ideally, each user should be able to define their own notion of importance for each individual query.

(a) Suppose you have your own notion of importance. Your interest in a node is proportional to the node's PageRank, because you totally rely upon Google to decide which website to visit (assume that these nodes represent websites). Again, use a random walk on the network generated in part 3 to simulate this personalized PageRank. Here the teleportation probability to each node is proportional to its PageRank (as opposed to the regular PageRank, where at teleportation, the chance of visiting each node is the same and equal to 1/N). Again, let the teleportation probability be equal to α = 0.15. Compare the results with 3(b).
(b) Find two nodes in the network with median PageRanks. Repeat part (a) if teleportations land only on those two nodes (with probabilities 1/2, 1/2). How are the PageRank values affected?
(c) More or less, this is what happens in the real world, in that a user browsing the web only teleports to a set of trusted web pages. However, this goes against the assumption of normal PageRank, where we assume that people's interest in all nodes is the same. Can you take into account the effect of this self-reinforcement and adjust the PageRank equation?

Final Remarks
The following functions from the igraph library are useful for this project:
• degree, degree.distribution, diameter, vcount, ecount
• random.graph.game, barabasi.game, aging.prefatt.game, degree.sequence.game
• page_rank
For part 2 of the project, you can start off with the Jupyter notebook provided to you.

In this project, we will study the various properties of social networks. In the first part of the project, we will study an undirected social network (Facebook). In the second part of the project, we will study a directed social network (Google+).

1 Facebook network
In this project, we will be using the dataset given below:
https://snap.stanford.edu/data/egonets-Facebook.html
The Facebook network can be created from the edgelist file (facebook_combined.txt).

1.1 Structural properties of the Facebook network
Having created the Facebook network, we will study some of the structural properties of the network. To be specific, we will study
• Connectivity
• Degree distribution
Question 1: Is the Facebook network connected? If not, find the giant connected component (GCC) of the network and report the size of the GCC.
Question 2: Find the diameter of the network. If the network is not connected, then find the diameter of the GCC.
Question 3: Plot the degree distribution of the Facebook network and report the average degree.
Question 4: Plot the degree distribution of question 3 in a log-log scale.
Try to fit a line to the plot and estimate the slope of the line.

1.2 Personalized network
A personalized network of a user vi is defined as the subgraph induced by vi and its neighbors. In this part, we will study some of the structural properties of the personalized network of the user whose graph node ID is 1 (node ID in the edgelist is 0). From this point onwards, whenever we refer to a node ID we mean the graph node ID, which is 1 + the node ID in the edgelist.
Question 5: Create a personalized network of the user whose ID is 1. How many nodes and edges does this personalized network have?
Question 6: What is the diameter of the personalized network? Please state a trivial upper and lower bound for the diameter of the personalized network.
Question 7: In the context of the personalized network, what does it mean for the diameter of the personalized network to be equal to the upper bound you derived in question 6? What does it mean for the diameter of the personalized network to be equal to the lower bound you derived in question 6?

1.3 Core node's personalized network
A core node is defined as a node that has more than 200 neighbors. For visualization purposes, we have displayed the personalized network of a core node below.
An example of a personal network. The core node is shown in black.
In this part, we will study various properties of the personalized networks of the core nodes.
Question 8: How many core nodes are there in the Facebook network? What is the average degree of the core nodes?

1.3.1 Community structure of core node's personalized network
In this part, we study the community structure of the core node's personalized network.
To be specific, we will study the community structure of the personalized networks of the following core nodes:
• Node ID 1
• Node ID 108
• Node ID 349
• Node ID 484
• Node ID 1087
Question 9: For each of the above core nodes' personalized networks, find the community structure using the Fast-Greedy, Edge-Betweenness, and Infomap community detection algorithms. Compare the modularity scores of the algorithms. For visualization purposes, display the community structure of the core nodes' personalized networks using colors. Nodes belonging to the same community should have the same color and nodes belonging to different communities should have different colors. In this question, you should have 15 plots in total.

1.3.2 Community structure with the core node removed
In this part, we will explore the effect on the community structure of a core node's personalized network when the core node itself is removed from the personalized network.
Question 10: For each of the core nodes' personalized networks (use the same core nodes as question 9), remove the core node from the personalized network and find the community structure of the modified personalized network. Use the same community detection algorithms as question 9. Compare the modularity score of the community structure of the modified personalized network with the modularity score of the community structure of the personalized network of question 9. For visualization purposes, display the community structure of the modified personalized network using colors. In this question, you should have 15 plots in total.

1.3.3 Characteristics of nodes in the personalized network
In this part, we will explore characteristics of nodes in the personalized network using two measures. These two measures are stated and defined below:
• Embeddedness of a node is defined as the number of mutual friends a node shares with the core node.
• Dispersion of a node is defined as the sum of distances between every pair of the mutual friends the node shares with the core node. The distances should be calculated in a modified graph where the node (whose dispersion is being computed) and the core node are removed.
For further details on the above characteristics, you can read the paper below:
https://arxiv.org/abs/1310.6753
Question 11: Write an expression relating the embeddedness of a node to its degree.
Question 12: For each of the core nodes' personalized networks (use the same core nodes as question 9), plot the distribution of embeddedness and dispersion. In this question, you will have 10 plots.
Question 13: For each of the core nodes' personalized networks, plot the community structure of the personalized network using colors and highlight the node with maximum dispersion. Also, highlight the edges incident to this node. To detect the community structure, use the Fast-Greedy algorithm. In this question, you will have 5 plots.
Question 14: Repeat question 13, but now highlight the node with maximum embeddedness and the node with maximum dispersion/embeddedness. Also, highlight the edges incident to these nodes.
Question 15: Use the plots from questions 13 and 14 to explain the characteristics of a node revealed by each of these measures.

1.4 Friend recommendation in personalized networks
In many social networks, it is desirable to predict future links between pairs of nodes in the network. In the context of this Facebook network, it is equivalent to recommending friends to users. In this part of the project, we will explore some neighborhood-based measures for friend recommendation. The network that we will be using for this part is the personalized network of the node with ID 415.

1.4.1 Neighborhood-based measures
In this project, we will be exploring three different neighborhood-based measures.
Before we define these measures, let's introduce some notation:
• Si is the neighbor set of node i in the network
• Sj is the neighbor set of node j in the network
Then, with the above notation, we define the three measures below:
• Common neighbors measure between node i and node j is defined as CommonNeighbors(i, j) = |Si ∩ Sj|
• Jaccard measure between node i and node j is defined as Jaccard(i, j) = |Si ∩ Sj| / |Si ∪ Sj|
• Adamic-Adar measure between node i and node j is defined as AdamicAdar(i, j) = Σ_{k ∈ Si ∩ Sj} 1 / log(|Sk|)

1.4.2 Friend recommendation using neighborhood-based measures
We can use the neighborhood-based measures defined in the previous section to recommend new friends to users in the network. Suppose we want to recommend t new friends to some user i in the network using the Jaccard measure. We follow the steps listed below:
1. For each node in the network that is not a neighbor of i, compute the Jaccard measure between node i and that node: compute Jaccard(i, j) ∀j ∈ Si^C
2. Then pick the t nodes that have the highest Jaccard measure with node i and recommend these nodes as friends to node i

1.4.3 Creating the list of users
Having defined the friend recommendation procedure, we can now apply it to the personalized network of node ID 415. Before we apply the algorithm, we need to create the list of users who we want to recommend new friends to. We create this list by picking all nodes with degree 24. We will denote this list as Nr.
Question 16: What is |Nr|?

1.4.4 Average accuracy of friend recommendation algorithm
In this part, we will apply the 3 different types of friend recommendation algorithms to recommend friends to the users in the list Nr. We will define an average accuracy measure to compare the performances of the friend recommendation algorithms. Suppose we want to compute the average accuracy of a friend recommendation algorithm. This task is completed in two steps:
1. Compute the average accuracy for each user in the list Nr
2. Compute the average accuracy of the algorithm by averaging across the accuracies of the users in the list Nr
Let's describe the procedure for accomplishing step 1 of the task. Suppose we want to compute the average accuracy for user i in the list Nr. We can compute it by iterating over the following steps 10 times and then taking the average:
1. Remove each edge of node i at random with probability 0.25. In this context, it is equivalent to deleting some friends of node i. Let's denote the list of friends deleted as Ri
2. Use one of the three neighborhood-based measures to recommend |Ri| new friends to user i. Let's denote the list of friends recommended as Pi
3. The accuracy for user i for this iteration is given by |Pi ∩ Ri| / |Ri|
Iterating over the above steps 10 times and then taking the average gives us the average accuracy of user i. In this manner, we compute the average accuracy for each user in the list Nr. Once we have computed them, we can take the mean of the average accuracies of the users in the list Nr. The mean value will be the average accuracy of the friend recommendation algorithm.
Question 17: Compute the average accuracy of the friend recommendation algorithm that uses:
• Common Neighbors measure
• Jaccard measure
• Adamic-Adar measure
Based on the average accuracy values, which friend recommendation algorithm is the best?

2 Google+ network
In this part, we will explore the structure of the Google+ network. The dataset for creating the network can be found in the link below:
https://snap.stanford.edu/data/egonets-Gplus.html
Create directed personal networks for users who have more than 2 circles. The data required to create such personal networks can be found in the file named gplus.tar.gz.
Question 18: How many personal networks are there?
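Stepping back to the Facebook part for a moment: the three neighborhood-based measures of section 1.4.1 and the recommendation step of section 1.4.2 can be sketched in pure Python on adjacency sets. This is only an illustration on a tiny hypothetical graph; in the project itself the sets Si would come from the personalized network of node 415.

```python
import math

def common_neighbors(S, i, j):
    # |Si ∩ Sj|
    return len(S[i] & S[j])

def jaccard(S, i, j):
    # |Si ∩ Sj| / |Si ∪ Sj|
    union = S[i] | S[j]
    return len(S[i] & S[j]) / len(union) if union else 0.0

def adamic_adar(S, i, j):
    # sum over common neighbors k of 1 / log(|Sk|); skip degree-1 nodes (log 1 = 0)
    return sum(1.0 / math.log(len(S[k])) for k in S[i] & S[j] if len(S[k]) > 1)

def recommend(S, i, t, measure):
    # score every non-neighbor j of i and pick the t highest-scoring nodes
    candidates = [j for j in S if j != i and j not in S[i]]
    candidates.sort(key=lambda j: measure(S, i, j), reverse=True)
    return candidates[:t]

# Toy adjacency sets (hypothetical graph, not the Facebook data):
S = {
    1: {2, 3, 4},
    2: {1, 3, 5},
    3: {1, 2, 4, 5},
    4: {1, 3},
    5: {2, 3},
}
print(recommend(S, 4, 1, jaccard))  # node 4's best candidate under Jaccard
```

Swapping `jaccard` for `common_neighbors` or `adamic_adar` in the `recommend` call gives the other two recommenders used in Question 17.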
Question 19: For the 3 personal networks (node IDs given below), plot the in-degree and out-degree distributions of these personal networks. Do the personal networks have similar in- and out-degree distributions? In this question, you should have 6 plots.
• 109327480479767108490
• 115625564993990145546
• 101373961279443806744

2.1 Community structure of personal networks
In this part of the project, we will explore the community structure of the personal networks that we created and explore the connections between communities and user circles.
Question 20: For the 3 personal networks picked in question 19, extract the community structure of each personal network using the Walktrap community detection algorithm. Report the modularity scores and plot the communities using colors. Are the modularity scores similar? In this question, you should have 3 plots.
Having found the communities, now we will explore the relationship between circles and communities. In order to explore the relationship, we define two measures:
• Homogeneity
• Completeness
Before we state the expressions for homogeneity and completeness, let's introduce some notation:
• C is the set of circles, C = {C1, C2, C3, · · · }
• K is the set of communities, K = {K1, K2, K3, · · · }
• ai is the number of people in circle Ci
• bi is the number of people in community Ki with circle information
• N is the total number of people with circle information
• Cji is the number of people belonging to community j and circle i
Then, with the above notation, we have the following expressions for the entropy
H(C) = − Σ_{i=1}^{|C|} (ai/N) log(ai/N)   (1)
H(K) = − Σ_{i=1}^{|K|} (bi/N) log(bi/N)   (2)
and conditional entropy
H(C|K) = − Σ_{j=1}^{|K|} Σ_{i=1}^{|C|} (Cji/N) log(Cji/bj)   (3)
H(K|C) = − Σ_{i=1}^{|C|} Σ_{j=1}^{|K|} (Cji/N) log(Cji/ai)   (4)
Now we can state the expression for homogeneity, h, as
h = 1 − H(C|K)/H(C)   (5)
and the expression for completeness, c, as
c = 1 − H(K|C)/H(K)   (6)
Question 21: Based on the expressions for h and c, explain the meaning of homogeneity and completeness in words.
Question 22: Compute the h and c values for the community structures of the 3 personal networks (same nodes as question 19). Interpret the values and provide a detailed explanation.

3 Submission
Please submit a zip file containing your codes and report to CCLE. The zip file should be named "Project2_UID1_…_UIDn.zip", where UIDx are the student ID numbers of the team members. If you have any questions, you can post them on Piazza.

1 Introduction
Reinforcement Learning (RL) is the task of learning from interaction to achieve a goal. The learner and decision maker is called the agent. The thing it interacts with, comprising everything outside the agent, is called the environment. These interact continually, the agent selecting actions and the environment responding to those actions by presenting rewards and new states. In the first part of the project, we will learn the optimal policy of an agent navigating a 2-D environment. We will implement the Value Iteration algorithm to learn the optimal policy. Inverse Reinforcement Learning (IRL) is the task of extracting an expert's reward function by observing the optimal policy of the expert. In the second part of the project, we will explore the application of IRL in the context of apprenticeship learning.

2 Reinforcement learning (RL)
The two main objects in Reinforcement Learning are:
• Agent
• Environment
In this project, we will learn the optimal policy of a single agent navigating a 2-D environment.

2.1 Environment
In this project, we assume that the environment of the agent is modeled by a Markov Decision Process (MDP). In an MDP, agents occupy a state of the environment and perform actions to change the state they are in. After taking an action, they are given some representation of the new state and some reward value associated with the new state.
An MDP formally is a tuple (S, A, P^a_{ss'}, R^a_{ss'}, γ) where:
• S is a set of states
• A is a set of actions
• P^a_{ss'} is a set of transition probabilities, where P^a_{ss'} is the probability of transitioning from state s ∈ S to state s' ∈ S after taking action a ∈ A
– P^a_{ss'} = P(st+1 = s' | st = s, at = a)
• Given any current state and action, s and a, together with any next state, s', the expected value of the next reward is R^a_{ss'}
– R^a_{ss'} = E(rt+1 | st = s, at = a, st+1 = s')
• γ ∈ [0, 1) is the discount factor, and it is used to compute the present value of future reward
– If γ is close to 1, then the future rewards are discounted less
– If γ is close to 0, then the future rewards are discounted more
In the next few subsections, we will discuss the parameters that will be used to generate the environment for the project.

2.1.1 State space
In this project, we consider the state space to be a 2-D square grid with 100 states. The 2-D square grid along with the numbering of the states is shown in figure 1.
Figure 1: 2-D square grid with state numbering

2.1.2 Action set
In this project, we consider the action set (A) to contain the 4 following actions:
• Move Right
• Move Left
• Move Up
• Move Down
The 4 types of actions are displayed in figure 2.
Figure 2: 4 types of action
From the above figure, we can see that the agent can take 4 actions from the state marked with a dot.

2.1.3 Transition probabilities
In this project, we define the transition probabilities in the following manner:
1. If states s' and s are not neighboring states in the 2-D grid, then P(st+1 = s' | st = s, at = a) = 0. States s' and s are neighbors in the 2-D grid if you can move to s' from s by taking an action a from the action set A. We will consider a state s to be a neighbor of itself. For example, from figure 1 we can observe that states 1 and 11 are neighbors (we can transition from 1 to 11 by moving right) but states 1 and 12 are not neighbors.
2.
Each action corresponds to a movement in the intended direction with probability 1 − w, but has a probability w of moving in a random direction instead, due to wind. To illustrate this, let's consider the states shown in figure 3.
Figure 3: Inner grid states (non-boundary states)
The transition probabilities for the non-boundary states shown in figure 3 are given below:
P(st+1 = 43 | st = 44, at = ↑) = 1 − w + w/4
P(st+1 = 34 | st = 44, at = ↑) = w/4
P(st+1 = 54 | st = 44, at = ↑) = w/4
P(st+1 = 45 | st = 44, at = ↑) = w/4
From the above calculation it can be observed that if the agent is at a non-boundary state, then it has 4 neighbors excluding itself and the probability w is uniformly distributed over the 4 neighbors. Also, if the agent is at a non-boundary state, then it transitions to a new state after taking an action (P(st+1 = 44 | st = 44, at = ↑) = 0).
3. If the agent is at one of the four corner states (0, 9, 90, 99), the agent stays at the current state if it takes an action to move off the grid or is blown off the grid by wind. The actions can be divided into two categories:
• Action to move off the grid
• Action to stay in the grid
To illustrate this, let's consider the states shown in figure 4.
Figure 4: Corner states
The transition probabilities for taking an action to move off the grid are given below:
P(st+1 = 10 | st = 0, at = ↑) = w/4
P(st+1 = 1 | st = 0, at = ↑) = w/4
P(st+1 = 0 | st = 0, at = ↑) = 1 − w + w/4 + w/4
The transition probabilities for taking an action to stay in the grid are given below:
P(st+1 = 10 | st = 0, at = →) = 1 − w + w/4
P(st+1 = 1 | st = 0, at = →) = w/4
P(st+1 = 0 | st = 0, at = →) = w/4 + w/4
At a corner state, you can be blown off the grid in two directions. As a result, we have P(st+1 = 0 | st = 0, at = →) = w/4 + w/4, since we can be blown off the grid in two directions and in both cases we stay at the current state.
4.
If the agent is at one of the edge states, the agent stays at the current state if it takes an action to move off the grid or is blown off the grid by wind. The actions can be divided into two categories:
• Action to move off the grid
• Action to stay in the grid
To illustrate this, let's consider the states shown in figure 5.
Figure 5: Edge states
The transition probabilities for taking an action to move off the grid are given below:
P(st+1 = 0 | st = 1, at = ←) = w/4
P(st+1 = 11 | st = 1, at = ←) = w/4
P(st+1 = 2 | st = 1, at = ←) = w/4
P(st+1 = 1 | st = 1, at = ←) = 1 − w + w/4
The transition probabilities for taking an action to stay in the grid are given below:
P(st+1 = 0 | st = 1, at = ↑) = 1 − w + w/4
P(st+1 = 11 | st = 1, at = ↑) = w/4
P(st+1 = 2 | st = 1, at = ↑) = w/4
P(st+1 = 1 | st = 1, at = ↑) = w/4
At an edge state, you can be blown off the grid in one direction. As a result, we have P(st+1 = 1 | st = 1, at = ↑) = w/4, since we can be blown off the grid in one direction and in that case we stay at the current state. The main difference between a corner state and an edge state is that a corner state has 2 neighbors and an edge state has 3 neighbors.

2.1.4 Reward function
To simplify the project, we will assume that the reward function is independent of the current state (s) and the action that you take at the current state (a). To be specific, the reward function only depends on the state that you transition to (s'). With this simplification, we have R^a_{ss'} = R(s').
In this project, we will learn the optimal policy of an agent for two different reward functions:
• Reward function 1
• Reward function 2
The two different reward functions are displayed in figures 6 and 7, respectively.
Figure 6: Reward function 1
Figure 7: Reward function 2
Question 1: (10 points) For visualization purposes, generate heat maps of Reward function 1 and Reward function 2. For the heat maps, make sure you display the coloring scale.
You will have 2 plots for this question. For solving question 1, you might find the following function useful:
https://matplotlib.org/api/_as_gen/matplotlib.pyplot.pcolor.html

3 Optimal policy learning using RL algorithms
In this part of the project, we will use a reinforcement learning (RL) algorithm to find the optimal policy. The main steps in an RL algorithm are:
• Find the optimal state-value or action-value
• Use the optimal state-value or action-value to determine the deterministic optimal policy
There are a couple of RL algorithms, but we will use the Value Iteration algorithm since it was discussed in detail in the lecture. We will skip the derivation of the algorithm here because it was covered in the lecture (for the derivation details please refer to the lecture slides on Reinforcement Learning). We will just reproduce the algorithm below for ease of implementation:

1: procedure Value Iteration(P^a_{ss'}, R^a_{ss'}, S, A, γ):
2: for all s ∈ S do            ▷ Initialization
3:     V(s) ← 0
4: end for
5: Δ ← ∞
6: while Δ > ε do             ▷ Estimation
7:     Δ ← 0
8:     for all s ∈ S do
9:         v ← V(s)
10:        V(s) ← max_{a∈A} Σ_{s'∈S} P^a_{ss'}[R^a_{ss'} + γV(s')]
11:        Δ ← max(Δ, |v − V(s)|)
12:    end for
13: end while
14: for all s ∈ S do           ▷ Computation
15:    π(s) ← argmax_{a∈A} Σ_{s'∈S} P^a_{ss'}[R^a_{ss'} + γV(s')]
16: end for
17: end procedure return π

Question 2: (40 points) Create the environment of the agent using the information provided in section 2. To be specific, create the MDP by setting up the state space, action set, transition probabilities, discount factor, and reward function. For creating the environment, use the following set of parameters:
• Number of states = 100 (state space is a 10 by 10 square grid as displayed in figure 1)
• Number of actions = 4 (set of possible actions is displayed in figure 2)
• w = 0.1
• Discount factor γ = 0.8
• Reward function 1
After you have created the environment, write an optimal state-value function that takes as input the environment of the agent and outputs the optimal value of each state in the grid. For the optimal state-value function, you have to implement the Initialization (lines 2-4) and Estimation (lines 5-13) steps of the Value Iteration algorithm. For the estimation step, use ε = 0.01. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal value of that state. In this question, you should have 1 plot.
Question 3: (5 points) Generate a heat map of the optimal state values across the 2-D grid. For generating the heat map, you can use the same function provided in the hint earlier (see the hint after question 1).
Question 4: (15 points) Explain the distribution of the optimal state values across the 2-D grid. (Hint: Use the figure generated in question 3 to explain.)
Question 5: (30 points) Implement the computation step of the Value Iteration algorithm (lines 14-17) to compute the optimal policy of the agent navigating the 2-D state space. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The optimal actions should be displayed using arrows. Does the optimal policy of the agent match your intuition? Please provide a brief explanation. Is it possible for the agent to compute the optimal action to take at each state by observing the optimal values of its neighboring states? In this question, you should have 1 plot.
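The Value Iteration pseudocode reproduced in section 3 can be sketched in plain Python. This is a minimal sketch on a generic MDP stored as dictionaries (P[s][a] is a list of (s', probability) pairs and R[s'] is the transition-independent reward of section 2.1.4); building the 100-state grid MDP itself is left to the project. The tiny 2-state MDP at the bottom is hypothetical, used only to exercise the function.

```python
def value_iteration(P, R, gamma=0.8, eps=0.01):
    """P[s][a] -> list of (s_next, prob); R[s_next] -> reward for entering s_next."""
    V = {s: 0.0 for s in P}                      # Initialization (lines 2-4)
    delta = float("inf")
    while delta > eps:                           # Estimation (lines 5-13)
        delta = 0.0
        for s in P:
            v = V[s]
            V[s] = max(sum(p * (R[sn] + gamma * V[sn]) for sn, p in P[s][a])
                       for a in P[s])
            delta = max(delta, abs(v - V[s]))
    policy = {s: max(P[s], key=lambda a: sum(p * (R[sn] + gamma * V[sn])
                                             for sn, p in P[s][a]))
              for s in P}                        # Computation (lines 14-17)
    return V, policy

# Hypothetical 2-state MDP: from each state the agent can 'stay' or 'go'.
P = {
    0: {"stay": [(0, 1.0)], "go": [(1, 1.0)]},
    1: {"stay": [(1, 1.0)], "go": [(0, 1.0)]},
}
R = {0: 0.0, 1: 1.0}                             # only state 1 is rewarding
V, policy = value_iteration(P, R)
print(policy)  # state 0 heads to state 1; state 1 stays
```

Note the in-place update of V inside the sweep, matching line 10 of the pseudocode; the policy is then read off with the same one-step lookahead as line 15.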
Question 6: (10 points) Modify the environment of the agent by replacing Reward function 1 with Reward function 2. Use the optimal state-value function implemented in question 2 to compute the optimal value of each state in the grid. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal value of that state. In this question, you should have 1 plot.
Question 7: (10 points) Generate a heat map of the optimal state values (found in question 6) across the 2-D grid. For generating the heat map, you can use the same function provided in the hint earlier.
Question 8: (20 points) Explain the distribution of the optimal state values across the 2-D grid. (Hint: Use the figure generated in question 7 to explain.)
Question 9: (20 points) Implement the computation step of the Value Iteration algorithm (lines 14-17) to compute the optimal policy of the agent navigating the 2-D state space. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The optimal actions should be displayed using arrows. Does the optimal policy of the agent match your intuition? Please provide a brief explanation. In this question, you should have 1 plot.

4 Inverse Reinforcement learning (IRL)
Inverse Reinforcement Learning (IRL) is the task of learning an expert's reward function by observing the optimal behavior of the expert. The motivation for IRL comes from apprenticeship learning. In apprenticeship learning, the goal of the agent is to learn a policy by observing the behavior of an expert. This task can be accomplished in two ways:
1. Learn the policy directly from the expert's behavior
2. Learn the expert's reward function and use it to generate the optimal policy
The second way is preferred because the reward function provides a much more parsimonious description of behavior.
The reward function, rather than the policy, is the most succinct, robust, and transferable definition of the task. Therefore, extracting the reward function of an expert would help design more robust agents. In this part of the project, we will use an IRL algorithm to extract the reward function. We will use the optimal policy computed in the previous section as the expert behavior and use the algorithm to extract the reward function of the expert. Then, we will use the extracted reward function to compute the optimal policy of the agent. We will compare the optimal policy of the agent to the optimal policy of the expert and use a similarity metric between the two to measure the performance of the IRL algorithm.

4.1 IRL algorithm
For finite state spaces, there are a couple of IRL algorithms for extracting the reward function:
• Linear Programming (LP) formulation
• Maximum Entropy formulation
Since we covered the LP formulation in the lecture and it is the simplest IRL algorithm, we will use the LP formulation in this project. We will skip the derivation of the algorithm here (for details on the derivation please refer to the lecture slides). The LP formulation of IRL is given by equation 1:

maximize_{R, ti, ui}  Σ_{i=1}^{|S|} (ti − λui)
subject to  [(P_{a1}(i) − P_a(i))(I − γP_{a1})^{-1} R] ≥ ti,  ∀a ∈ A \ a1, ∀i
            (P_{a1} − P_a)(I − γP_{a1})^{-1} R ⪰ 0,  ∀a ∈ A \ a1
            −u ⪯ R ⪯ u
            |Ri| ≤ Rmax,  i = 1, 2, · · · , |S|        (1)

In the LP given by equation 1, R is the reward vector (R(i) = R(si)), Pa is the transition probability matrix, λ is the adjustable penalty coefficient, and the ti's and ui's are the extra optimization variables (please note that u(i) = ui). Use the maximum absolute value of the ground truth reward as Rmax. For ease of implementation, we can recast the LP in equation 1 into an equivalent form given by equation 2 using block matrices:

maximize_x  c^T x
subject to  Dx ⪯ 0        (2)

Question 10: (10 points) Express c, x, and D in terms of R, Pa, P_{a1}, ti, u, and Rmax.

4.2 Performance measure
In this project, we use a very simple measure to evaluate the performance of the IRL algorithm. Before we state the performance measure, let's introduce some notation:
• OA(s): Optimal action of the agent at state s
• OE(s): Optimal action of the expert at state s
• m(s) = 1 if OA(s) = OE(s), and 0 otherwise
Then, with the above notation, accuracy is given by equation 3:
Accuracy = (Σ_{s∈S} m(s)) / |S|        (3)
Since we are using the optimal policy found in the previous section as the expert behavior, we will use the optimal policy found in the previous section to fill in the OE(s) values. Please note that these values will be different depending on whether we used Reward function 1 or Reward function 2 to create the environment. To compute OA(s), we will solve the linear program given by equation 2 to extract the reward function of the expert. For solving the linear program you can use the LP solver in Python (from cvxopt import solvers and then use solvers.lp). Then, we will use the extracted reward function to compute the optimal policy of the agent using the Value Iteration algorithm you implemented in the previous section. The optimal policy of the agent found in this manner will be used to fill in the OA(s) values. Please note that these values will depend on the adjustable penalty coefficient λ. We will tune λ to maximize the accuracy.
Question 11: (30 points) Sweep λ from 0 to 5 to get 500 evenly spaced values of λ. For each value of λ, compute OA(s) by following the process described above. For this problem, use the optimal policy of the agent found in question 5 to fill in the OE(s) values. Then use equation 3 to compute the accuracy of the IRL algorithm for this value of λ. You need to repeat the above process for all 500 values of λ to get 500 data points. Plot λ (x-axis) against Accuracy (y-axis). In this question, you should have 1 plot.
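The accuracy of equation 3 is simple enough to sketch directly. The snippet below assumes the two policies are stored as dictionaries mapping state to action (the 4-state policies shown are hypothetical, not outputs of the project's MDP); the LP solve and the λ sweep are left to the project.

```python
def accuracy(agent_policy, expert_policy):
    """Fraction of states where OA(s) = OE(s), i.e. equation 3."""
    matches = sum(1 for s in expert_policy if agent_policy[s] == expert_policy[s])
    return matches / len(expert_policy)

# Hypothetical policies on a 4-state toy grid:
OE = {0: "right", 1: "up", 2: "up", 3: "left"}
OA = {0: "right", 1: "up", 2: "down", 3: "left"}
print(accuracy(OA, OE))  # 3 of the 4 states agree
```

In the λ sweep of Question 11, this function would be called once per λ value, with `OA` recomputed from the reward extracted at that λ.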
Question 12: (5 points) Use the plot in question 11 to find the value of λ for which the accuracy is maximum. For future reference we will denote this value as λmax^(1). Please report λmax^(1).
Question 13: (15 points) For λmax^(1), generate heat maps of the ground truth reward and the extracted reward. Please note that the ground truth reward is Reward function 1 and the extracted reward is computed by solving the linear program given by equation 2 with the parameter λ set to λmax^(1). In this question, you should have 2 plots.
Question 14: (10 points) Use the extracted reward function computed in question 13 to compute the optimal values of the states in the 2-D grid. For computing the optimal values you need to use the optimal state-value function that you wrote in question 2. For visualization purposes, generate a heat map of the optimal state values across the 2-D grid (similar to the figure generated in question 3). In this question, you should have 1 plot.
Question 15: (10 points) Compare the heat maps of question 3 and question 14 and provide a brief explanation of their similarities and differences.
Question 16: (10 points) Use the extracted reward function found in question 13 to compute the optimal policy of the agent. For computing the optimal policy of the agent you need to use the function that you wrote in question 5. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The actions should be displayed using arrows. In this question, you should have 1 plot.
Question 17: (10 points) Compare the figures of question 5 and question 16 and provide a brief explanation of their similarities and differences.
Question 18: (30 points) Sweep λ from 0 to 5 to get 500 evenly spaced values of λ. For each value of λ, compute OA(s) by following the process described above.
For this problem, use the optimal policy of the agent found in question 9 to fill in the OE(s) values. Then use equation 3 to compute the accuracy of the IRL algorithm for this value of λ. You need to repeat the above process for all 500 values of λ to get 500 data points. Plot λ (x-axis) against Accuracy (y-axis). In this question, you should have 1 plot.
Question 19: (5 points) Use the plot in question 18 to find the value of λ for which the accuracy is maximum. For future reference we will denote this value as λmax^(2). Please report λmax^(2).
Question 20: (15 points) For λmax^(2), generate heat maps of the ground truth reward and the extracted reward. Please note that the ground truth reward is Reward function 2 and the extracted reward is computed by solving the linear program given by equation 2 with the parameter λ set to λmax^(2). In this question, you should have 2 plots.
Question 21: (10 points) Use the extracted reward function computed in question 20 to compute the optimal values of the states in the 2-D grid. For computing the optimal values you need to use the optimal state-value function that you wrote in question 2. For visualization purposes, generate a heat map of the optimal state values across the 2-D grid (similar to the figure generated in question 7). In this question, you should have 1 plot.
Question 22: (10 points) Compare the heat maps of question 7 and question 21 and provide a brief explanation of their similarities and differences.
Question 23: (10 points) Use the extracted reward function found in question 20 to compute the optimal policy of the agent. For computing the optimal policy of the agent you need to use the function that you wrote in question 9. For visualization purposes, you should generate a figure similar to figure 1 but with the number of each state replaced by the optimal action at that state. The actions should be displayed using arrows. In this question, you should have 1 plot.
Question 24: (10 points) Compare the figures of Question 9 and Question 23 and provide a brief explanation of their similarities and differences.

Question 25: (50 points) From the figure in question 23, you should observe that the optimal policy of the agent has two major discrepancies. Please identify and provide the causes for these two discrepancies. One of the discrepancies can be fixed easily by a slight modification to the value iteration algorithm. Perform this modification and re-run the modified value iteration algorithm to compute the optimal policy of the agent. Also, recompute the maximum accuracy after this modification. Is there a change in maximum accuracy? The second discrepancy is harder to fix and is a limitation of the simple IRL algorithm. If you can provide a solution to the second discrepancy then we will give you a bonus of 50 points.

5 Submission

Please submit a zip file containing your code and report to CCLE. The zip file should be named "Project2 UID1 … UIDn.zip", where UIDx are the student ID numbers of the team members.

In this project, we will study various properties of the Internet Movie Database (IMDb). In the first part of the project, we will explore the properties of a directed actor/actress network. In the second part of the project, we will explore the properties of an undirected movie network.

1 Actor/Actress network

In this part of the project, we will create the network using the data from the following text files:

• actor_movies.txt
• actress_movies.txt

The text files can be downloaded from the following link: https://ucla.box.com/s/z45q3g5zrpay8b8gtbql6ojaecb7kj2u

In order to create the network in a consistent manner, you will need to do some data preprocessing. The preprocessing consists of 2 parts:

1. Merging the two text files into one and then removing every actor/actress who has acted in fewer than 10 movies
2. Cleaning the merged text file

The cleaning part is necessary to avoid inconsistency in the network creation.
If you analyze the merged text file, you will observe that the same movie might be counted multiple times due to the role of the actor/actress in that movie. For example, we might have

• Movie X (voice)
• Movie X (as uncredited)

If you don't clean the merged text file, then Movie X (voice) and Movie X (as uncredited) will be considered different movies. Therefore, you will need to perform some cleaning operations to remove inconsistencies of various types.

Question 1: Perform the preprocessing on the two text files and report the total number of actors and actresses and the total number of unique movies that these actors and actresses have acted in.

1.1 Directed actor/actress network creation

We will use the processed text file to create the directed actor/actress network. The nodes of the network are the actors/actresses and there are weighted edges between the nodes in the network. The weights of the edges are given by equation 1

wi→j = |Si ∩ Sj| / |Si|   (1)

where Si is the set of movies in which actor/actress vi has acted and Sj is the set of movies in which actor/actress vj has acted.

Question 2: Create a weighted directed actor/actress network using the processed text file and equation 1. Plot the in-degree distribution of the actor/actress network. Briefly comment on the in-degree distribution.
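The two preprocessing pieces above — title cleaning and the equation-1 weights — can be sketched as follows. The regex covers only the example annotations shown in the handout and is an assumption; real data will need more cleaning rules:

```python
import re

def clean_title(raw):
    """Strip trailing role annotations such as '(voice)' or '(as uncredited)'.
    This regex is illustrative only, not a complete cleaning solution."""
    return re.sub(r"\s*\((?:voice|uncredited|as [^)]*)\)\s*$", "", raw).strip()

def edge_weights(movie_sets):
    """Directed weights w_{i->j} = |S_i intersect S_j| / |S_i| (equation 1).
    movie_sets: dict mapping actor/actress name -> set of cleaned titles."""
    weights = {}
    for i, Si in movie_sets.items():
        for j, Sj in movie_sets.items():
            if i != j and Si & Sj:
                weights[(i, j)] = len(Si & Sj) / len(Si)
    return weights

w = edge_weights({"a": {"X", "Y"}, "b": {"X"}})
# Note the asymmetry: w[("a", "b")] = 1/2 but w[("b", "a")] = 1,
# which is why the network is genuinely directed.
```

The same intersection divided by different denominators is what makes the graph directed rather than symmetric.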
Also for each pair, report the (input actor, output actor) edge weight. Do all the actor pairings make sense?

1.3 Actor rankings

In this section, we will extract the top 10 actors/actresses from the network.

Question 4: Use Google's PageRank algorithm to find the top 10 actors/actresses in the network. Report the top 10 actors/actresses and also the number of movies and the in-degree of each actor/actress in the top 10 list. Does the top 10 list have any actor/actress listed in the previous section? If it does not have any of the actors/actresses listed in the previous section, please provide an explanation for this phenomenon.

Question 5: Report the PageRank scores of the actors/actresses listed in the previous section. Also report the number of movies each of these actors/actresses has acted in and their in-degree.

2 Movie network

In this part, we will create an undirected movie network and then explore various structural properties of the network.

2.1 Undirected movie network creation

We will use the processed text files from the previous section to create the movie network. The nodes of the network are the movies and there are weighted edges between the nodes in the network. To reduce the size of the network, we will only consider movies that have at least 5 actors/actresses in them. The weights of the edges are given by equation 2

wi→j = |Ai ∩ Aj| / |Ai ∪ Aj|   (2)

where Ai is the set of actors in movie vi and Aj is the set of actors in movie vj. Since wi→j = wj→i, we have an undirected network.

Question 6: Create a weighted undirected movie network using equation 2. Plot the degree distribution of the movie network. Briefly comment on the degree distribution.

2.2 Communities in the movie network

In this part, we will extract the communities in the movie network and explore their relationship with the movie genre. For this part you will need to load the movie_genre.txt file.
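Questions 4 and 5 ask for PageRank scores. In practice you would run a library implementation (e.g. igraph's `page_rank`) on the full actor network; as a dependency-free sketch of what that computes, here is weighted PageRank by power iteration:

```python
def pagerank(out_edges, damping=0.85, iters=100):
    """Weighted PageRank by power iteration (illustrative sketch).
    out_edges: dict node -> dict of (successor -> edge weight)."""
    nodes = set(out_edges) | {v for nbrs in out_edges.values() for v in nbrs}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        dangling = 0.0
        for u in nodes:
            nbrs = out_edges.get(u)
            if not nbrs:
                dangling += rank[u]      # no out-links: spread uniformly
                continue
            total = sum(nbrs.values())
            for v, wt in nbrs.items():
                new[v] += damping * rank[u] * wt / total
        for v in nodes:
            new[v] += damping * dangling / n
        rank = new
    return rank
```

Sorting the returned dict by score and taking the first 10 entries gives the top-10 list requested in question 4.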
Question 7: Use the Fast Greedy community detection algorithm to find the communities in the movie network. Pick 10 communities and for each community plot the distribution of the genres of the movies in the community.

Question 8(a): In each community, determine the most dominant genre based simply on frequency counts. Which genres tend to be the most frequent dominant ones across communities and why?

Question 8(b): In each community, for the i-th genre assign a score of ln(c(i)) · p(i)/q(i), where: c(i) is the number of movies belonging to genre i in the community; p(i) is the fraction of genre i movies in the community; and q(i) is the fraction of genre i movies in the entire data set. Now determine the most dominant genre in each community based on the modified scores. What are your findings and how do they differ from the results in 8(a)?

Question 8(c): Find a community of movies that has size between 10 and 20. Determine all the actors who acted in these movies and plot the corresponding bipartite graph (i.e. restricted to these particular movies and actors). Determine the three most important actors and explain how they help form the community. Is there a correlation between these actors and the dominant genres you found for this community in 8(a) and 8(b)?

2.3 Neighborhood analysis of movies

In this part of the project, you will need to load the movie_rating.txt file and we will explore the neighborhood of the following 3 movies:

• Batman v Superman: Dawn of Justice (2016); Rating: 6.6
• Mission: Impossible – Rogue Nation (2015); Rating: 7.4
• Minions (2015); Rating: 6.4

Question 9: For each of the movies listed above, extract its neighbors and plot the distribution of the available ratings of the movies in the neighborhood. Is the average rating of the movies in the neighborhood similar to the rating of the movie whose neighbors have been extracted? In this question, you should have 3 plots.
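The question 8(b) score can be computed directly from genre counts. Note that ln(c(i)) = 0 when a genre appears only once in a community, so singletons never dominate. The toy data below is invented purely for illustration:

```python
import math
from collections import Counter

def dominant_genre(community_genres, all_genres):
    """Score genre i as ln(c(i)) * p(i) / q(i) (question 8(b)) and return
    (best genre, all scores). Inputs: lists of genre labels, one per movie."""
    c = Counter(community_genres)
    q = Counter(all_genres)
    scores = {}
    for g, cnt in c.items():
        p = cnt / len(community_genres)      # fraction inside the community
        qg = q[g] / len(all_genres)          # fraction in the whole data set
        scores[g] = math.log(cnt) * p / qg   # ln(1) = 0: singletons score 0
    return max(scores, key=scores.get), scores

# "Drama" wins on raw counts (8(a)), but the globally rarer,
# community-concentrated "Sci-Fi" wins on the 8(b) score.
comm = ["Drama"] * 5 + ["Sci-Fi"] * 4
full = ["Drama"] * 80 + ["Sci-Fi"] * 10 + ["Comedy"] * 10
best, _ = dominant_genre(comm, full)
```

Dividing by q(i) is what discounts genres that are frequent everywhere, which is the qualitative difference 8(b) asks you to observe.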
Question 10: Repeat question 9, but now restrict the neighborhood to consist of movies from the same community. Is there a better match between the average rating of the movies in the restricted neighborhood and the rating of the movie whose neighbors have been extracted? In this question, you should have 3 plots.

Question 11: For each of the movies listed above, extract its top 5 neighbors and also report the community membership of the top 5 neighbors. In this question, the sorting is done based on the edge weights.

2.4 Predicting ratings of movies

In this part of the project, we will explore various rating prediction techniques to predict the ratings of the following 3 movies:

• Batman v Superman: Dawn of Justice (2016)
• Mission: Impossible – Rogue Nation (2015)
• Minions (2015)

Question 12: Train a regression model to predict the ratings of movies: for the training set you can pick any subset of movies with available ratings as the target variables; you have to specify the exact feature set that you use to train the regression model and report the root mean squared error (RMSE). Now use this trained model to predict the ratings of the 3 movies listed above (which obviously should not be included in your training data).

We will now predict the ratings of the movies using a different approach. To be specific, we will use a bipartite graph approach for rating prediction. In a bipartite graph G = (V, E), we have a partition of the vertex set such that

V1 ∪ V2 = V,  V1 ∩ V2 = ∅

and eij = (vi, vj) where vi ∈ V1 and vj ∈ V2. In a bipartite graph, vertices belonging to the same partitioned set are non-adjacent. In this project, we will create a bipartite graph in the following manner:

• V1 represents the set of actors/actresses
• V2 represents the set of movies
• There is an edge eij between a node in V1 and a node in V2 if actor i has acted in movie j

Question 13: Create a bipartite graph following the procedure described above.
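A sketch of the bipartite construction, plus one possible actor-weight metric for question 13 (the weight choice here — mean rating of an actor's other rated movies — is our assumption for illustration, not the required answer; the question asks you to choose and justify your own):

```python
def build_bipartite(actor_movies):
    """V1 = actors, V2 = movies, edge (a, m) iff actor a acted in movie m.
    actor_movies: dict actor -> iterable of movie titles."""
    edges = [(a, m) for a, ms in actor_movies.items() for m in ms]
    return sorted(actor_movies), sorted({m for _, m in edges}), edges

def predict_rating(movie, actor_movies, ratings):
    """Hypothetical metric: weight each cast member by the mean rating of
    their other rated movies, then average the cast's weights."""
    cast = [a for a, ms in actor_movies.items() if movie in ms]
    weights = []
    for a in cast:
        rated = [ratings[m] for m in actor_movies[a]
                 if m in ratings and m != movie]
        if rated:
            weights.append(sum(rated) / len(rated))
    return sum(weights) / len(weights) if weights else None
```

Because the two vertex sets never share members, every edge crosses the partition, which is exactly the bipartite property stated above.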
Determine and justify a metric for assigning a weight to each actor. Then predict the ratings of the 3 movies using the weights of the actors in the bipartite graph. Report the RMSE. Is this rating mechanism better than the one in question 12? Justify your answer.

Introduction

In this project we will explore graph theory theorems and algorithms by applying them to real data. In the first part of the project, we consider a particular graph which models correlations between stock price time series. In the second part, we analyse traffic data on a dataset provided by Uber.

1 Stock Market

In this part of the project, we study data from the stock market. The data is available on this Dropbox Link. The goal of this part is to study correlation structures among fluctuation patterns of stock prices using tools from graph theory. The intuition is that investors will have similar strategies of investment for stocks that are affected by the same economic factors. For example, the stocks belonging to the transportation sector may have different absolute prices, but if, say, fuel prices change or are expected to change significantly in the near future, then you would expect the investors to buy or sell all such stocks similarly and maximize their returns. Towards that goal, we construct different graphs based on similarities among the time series of returns on different stocks at different time scales (a day vs a week). Then we study properties of such graphs.

The data is obtained from the Yahoo Finance website for 3 years. You're provided with a number of csv tables, each containing several fields: Date, Open, High, Low, Close, Volume, and Adj Close price. The files are named according to the Ticker Symbol of each stock. You may find the market sector for each company in Name_sector.csv.

1.1 Return correlation

In this part of the project, we will compute the correlation among log-normalized stock-return time series data.
Before giving the expression for correlation, we introduce the following notation:

• pi(t) is the closing price of stock i on the t-th day
• qi(t) is the return of stock i over the period [t − 1, t]:

qi(t) = (pi(t) − pi(t − 1)) / pi(t − 1)

• ri(t) is the log-normalized return of stock i over the period [t − 1, t]:

ri(t) = log(1 + qi(t))

Then, with the above notation, we define the correlation between the log-normalized stock-return time series data of stocks i and j as

ρij = (⟨ri(t)rj(t)⟩ − ⟨ri(t)⟩⟨rj(t)⟩) / √((⟨ri(t)²⟩ − ⟨ri(t)⟩²)(⟨rj(t)²⟩ − ⟨rj(t)⟩²))

where ⟨·⟩ is a temporal average over the investigated time regime (for our data set it is over 3 years).

Question 1: Provide an upper and lower bound on ρij. Also, provide a justification for using the log-normalized return ri(t) instead of the regular return qi(t).

1.2 Constructing correlation graphs

In this part, we construct a correlation graph using the correlation coefficients computed in the previous section. The correlation graph has the stocks as its nodes and the edge weights are given by the following expression:

wij = √(2(1 − ρij))

Compute the edge weights using the above expression and construct the correlation graph.

Question 2: Plot the degree distribution of the correlation graph and a histogram showing the un-normalized distribution of edge weights.

1.3 Minimum spanning tree (MST)

In this part of the project, we will extract the MST of the correlation graph and interpret it.

Question 3: Extract the MST of the correlation graph. Each stock can be categorized into a sector, which can be found in the Name_sector.csv file. Plot the MST and color-code the nodes based on sectors. Do you see any pattern in the MST? The structures that you find in the MST are called Vine clusters. Provide a detailed explanation of the pattern you observe.

1.4 Sector clustering in MSTs

In this part, we want to predict the market sector of an unknown stock. We will explore two methods for performing the task.
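The section 1.1–1.2 formulas map directly onto numpy: `np.corrcoef` computes exactly the ⟨·⟩-based Pearson correlation, and a small clamp guards against floating-point values of ρ fractionally above 1 before the square root:

```python
import numpy as np

def correlation_weights(prices):
    """prices: (T, N) array of daily closing prices, one column per stock.
    Returns (rho, w) with w_ij = sqrt(2 * (1 - rho_ij))."""
    q = np.diff(prices, axis=0) / prices[:-1]   # q_i(t), the daily returns
    r = np.log1p(q)                             # r_i(t) = log(1 + q_i(t))
    rho = np.corrcoef(r, rowvar=False)          # Pearson = the <.> formula
    # clamp tiny floating-point excursions above rho = 1 before the sqrt
    w = np.sqrt(np.maximum(2.0 * (1.0 - rho), 0.0))
    return rho, w
```

Since ρij ∈ [−1, 1], the weights fall in [0, 2]: perfectly correlated stocks are at distance 0 and perfectly anti-correlated ones at distance 2, which is what makes w usable as an MST edge length.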
In order to evaluate the performance of the methods, we define the following metric:

α = (1/|V|) Σ_{vi∈V} P(vi ∈ Si)

where Si is the sector of node i. Define

P(vi ∈ Si) = |Qi| / |Ni|

where Qi is the set of neighbors of node i that belong to the same sector as node i and Ni is the set of neighbors of node i. Compare α with the case where

P(vi ∈ Si) = |Si| / |V|

Question 4: Report the value of α for the above two cases and provide an interpretation for the difference.

1.5 Correlation graphs for weekly data

In the previous parts, we constructed the correlation graph based on daily data. In this part of the project, we will construct a correlation graph based on weekly data. To create the graph, sample the stock data weekly on Mondays and then calculate ρij using the sampled data. If there is a holiday on a Monday, we ignore that week. Create the correlation graph based on weekly data.

Question 5: Extract the MST from the correlation graph based on weekly data. Compare the pattern of this MST with the pattern of the MST found in question 3.

2 Let's Help Santa!

Companies like Google and Uber have a vast amount of statistics about transportation dynamics. Santa has decided to use network theory to facilitate his gift delivery for next Christmas. When we learned about his decision, we designed this part of the project to help him. We will send him your results for this part!

2.1 Download the Data

Go to the "Uber Movement" website and download the data of Monthly Aggregate (all days), 2017 Quarter 4, for the San Francisco area. The dataset contains pairwise traveling time statistics between most pairs of points in the San Francisco area. Points on the map are represented by unique IDs. To understand the correspondence between map IDs and areas, download the Geo Boundaries file from the same website. This file contains latitudes and longitudes of the corners of the polygons circumscribing each area.
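The first version of the α metric above can be sketched in a few lines; the second (baseline) case just replaces |Qi|/|Ni| with the constant |Si|/|V| per node:

```python
def alpha(neighbors, sector):
    """alpha = (1/|V|) * sum over nodes of P(v_i in S_i) = |Q_i| / |N_i|.
    neighbors: dict node -> set of MST neighbors; sector: dict node -> label."""
    total = 0.0
    for v, nbrs in neighbors.items():
        if nbrs:
            same = sum(1 for u in nbrs if sector[u] == sector[v])
            total += same / len(nbrs)
    return total / len(neighbors)
```

A high α means MST neighbors tend to share a sector, i.e. the vine clusters in question 3 line up with the sector labels.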
In addition, it contains one street address inside each area, referred to as DISPLAY_NAME. To be specific, if an area is represented by a polygon with 5 corners, then you have a 5×2 matrix of latitudes and longitudes, each row of which represents the latitude and longitude of one corner.

2.2 Build Your Graph

Read the dataset at hand, and build a graph in which nodes correspond to locations and undirected weighted edges correspond to the mean traveling times between each pair of locations (December only). Add the following attributes to the vertices:

1. Display name: the street address
2. Location: mean of the coordinates of the polygon's corners (a 2-D vector)

(If you download the dataset correctly, it should be named san_francisco-censustracts-2017-4-All-MonthlyAggregate.csv, and the Geo Boundaries file should be named SAN_FRANCISCO_CENSUSTRACTS.JSON.)

The graph will contain some isolated nodes (extra nodes existing in the Geo Boundaries JSON file) and a few small connected components. Remove such nodes and just keep the giant connected component of the graph. In addition, merge duplicate edges by averaging their weights. We will refer to this cleaned graph as G afterwards.

Question 6: Report the number of nodes and edges in G.

2.3 Traveling Salesman Problem

Question 7: Build a minimum spanning tree (MST) of graph G. Report the street addresses of the two endpoints of a few edges. Are the results intuitive?

Question 8: Determine what percentage of triangles in the graph (sets of 3 points on the map) satisfy the triangle inequality. You do not need to inspect all triangles; you can just estimate by random sampling of 1000 triangles.

Now, we want to find an approximate solution for the traveling salesman problem (TSP) on G. Apply the 2-approximation algorithm described in class. Inspect the sequence of street addresses visited on the map and see if the results are intuitive.
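One way to sketch the cleaning step described above — averaging duplicate/reversed edges, then keeping only the giant connected component — without committing to a particular graph library:

```python
from collections import defaultdict, deque

def clean_graph(edge_records):
    """edge_records: iterable of (u, v, mean_travel_time) rows.
    Merges duplicate/reversed edges by averaging their weights and keeps
    only edges inside the giant connected component."""
    samples = defaultdict(list)
    for u, v, t in edge_records:
        samples[(min(u, v), max(u, v))].append(t)   # undirected key
    edges = {k: sum(ts) / len(ts) for k, ts in samples.items()}

    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    seen, giant = set(), set()
    for s in adj:                                   # BFS per component
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    queue.append(y)
        seen |= comp
        if len(comp) > len(giant):
            giant = comp
    return {k: t for k, t in edges.items() if k[0] in giant}
```

The number of keys in the returned dict and the number of distinct nodes they mention answer question 6 directly.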
Question 9: Find the empirical performance of the approximate algorithm:

ρ = (Approximate TSP Cost) / (Optimal TSP Cost)

Question 10: Plot the trajectory that Santa has to travel!

(Notes: duplicate edges may exist when the dataset provides you with the statistic of a road in both directions; we remove duplicate edges for the sake of simplicity. The 2-approximation algorithm can be found in Papadimitriou and Steiglitz, "Combinatorial Optimization: Algorithms and Complexity", Chapter 17, page 414.)

3 Analysing the Traffic Flow

Next December, there is going to be a sports event between Stanford University and the University of California, Santa Cruz (UCSC). A large number of students are enthusiastic about the event, which is going to be held at UCSC. Stanford fans want to drive from their campus to the rival's. We would like to analyse the maximum traffic that can flow from Stanford to UCSC.

3.1 Estimate the Roads

We want to estimate the map of roads without using actual road datasets. Educate yourself about the Delaunay triangulation algorithm and then apply it to the node coordinates (you can use scipy.spatial.Delaunay in Python).

Question 11: Plot the road mesh that you obtain and explain the result. Create a subgraph G∆ induced by the edges produced by the triangulation.

3.2 Calculate Road Traffic Flows

Question 12: Using simple math, calculate the traffic flow for each road in terms of cars/hour.

Hint: Consider the following assumptions:

• Each degree of latitude and longitude ≈ 69 miles
• Car length ≈ 5 m ≈ 0.003 mile
• Cars maintain a safety distance of 2 seconds to the next car
• Each road has 2 lanes in each direction

Assuming no traffic jam, consider the calculated traffic flow as the max capacity of each road.

3.3 Calculate the Max Flow

Consider the following addresses:

• Source address: 100 Campus Drive, Stanford
• Destination address: 700 Meder Street, Santa Cruz

Question 13: Calculate the maximum number of cars that can commute per hour from Stanford to UCSC.
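The question-8 sampling estimate can be sketched as follows (the treatment of sampled triples that are not actual triangles in G is a simplification; you may prefer to resample instead of skipping):

```python
import random

def triangle_inequality_rate(nodes, weight, samples=1000, seed=0):
    """Estimate the fraction of randomly sampled triangles whose mean
    travel times satisfy the triangle inequality in all three ways.
    weight: dict of symmetric (u, v) -> travel time entries."""
    rng = random.Random(seed)
    t = lambda a, b: weight.get((a, b), weight.get((b, a)))
    ok = 0
    for _ in range(samples):
        u, v, x = rng.sample(nodes, 3)
        d1, d2, d3 = t(u, v), t(v, x), t(u, x)
        if None in (d1, d2, d3):
            continue        # sampled triple is not a triangle in G
        if d1 + d2 >= d3 and d2 + d3 >= d1 and d1 + d3 >= d2:
            ok += 1
    return ok / samples
```

Travel times need not be metric (congestion can make a detour faster), which is why this rate is interesting before running the MST-based 2-approximation, whose guarantee assumes the triangle inequality.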
Also calculate the number of edge-disjoint paths between the two spots. Does the number of edge-disjoint paths match what you see on your road map?

3.4 Defoliate Your Graph

In G∆, there are a number of unreal roads that could be removed. For instance, there are many fake bridges crossing the bay. Apply a threshold on the travel time of the roads in G∆ to remove the fake edges. Trim the fake edges and call the resulting graph G̃∆.

Question 14: Plot G̃∆ on real map coordinates. Are real bridges preserved?

Hint: You can consider the following coordinates:

• Golden Gate Bridge: [[-122.475, 37.806], [-122.479, 37.83]]
• Richmond–San Rafael Bridge: [[-122.501, 37.956], [-122.387, 37.93]]
• San Mateo Bridge: [[-122.273, 37.563], [-122.122, 37.627]]
• Dumbarton Bridge: [[-122.142, 37.486], [-122.067, 37.54]]
• San Francisco–Oakland Bay Bridge: [[-122.388, 37.788], [-122.302, 37.825]]

Question 15: Now repeat question 8 for G̃∆ and report the results. Do you see any significant changes?
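One common reading of the question-12 hints (this interpretation is an assumption, not something the handout states explicitly): speed = road length / mean travel time, the spacing between successive front bumpers is 2 s × speed plus one car length, and flow = lanes × speed / spacing.

```python
def road_capacity(length_miles, travel_time_s, lanes=2):
    """Max flow of one road in cars/hour under the hinted assumptions:
    2-second safety gap, 0.003-mile car length, 2 lanes per direction."""
    speed = length_miles / travel_time_s          # miles per second
    spacing = 2.0 * speed + 0.003                 # miles between front bumpers
    return lanes * speed / spacing * 3600.0       # cars per hour
```

As speed grows the car-length term becomes negligible and the flow approaches lanes × 3600 s / 2 s = 3600 cars/hour. For question 13, these values serve as edge capacities in a standard max-flow routine (e.g. `networkx.maximum_flow`), and by the max-flow/min-cut relationship the edge-disjoint path count comes from the same machinery with unit capacities.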


[SOLVED] CPSC 427 problem sets 1 to 8 solutions

1 Assignment Goals

1. Learn how to prepare and build C++ code on the Zoo.
2. Learn how to customize and use tools.cpp and tools.hpp in your own code.
3. Learn good practices for labeling every code file with the appropriate date, authorship, and acknowledgments of any code fragments taken from elsewhere.
4. Learn conventions used in this course regarding the form of the main program, use of banner() and bye() from the tools library, compiler switches, include guards, and so forth.
5. Learn simple uses of C++ strings.
6. Learn how to do simple C++ I/O with a validity check.
7. Learn how to use C library time functions from within C++.
8. Learn how to report a fatal error by calling the Fatal() function in tools.
9. Learn how to test and submit your code on the Zoo.

2 Problem

You are to write a program called aboutme that, when run, prompts the user to enter their first name, last name, and year of birth. The program then prints out the first name, last name, and age, which we take to be the difference between the current year and the year of birth. Here's what a sample run for me might look like:

—————————————————————
Michael J. Fischer CPSC 427/527 Sun Sep 2 2018 21:24:11
—————————————————————
Please enter your first name: Michael
Please enter your last name: Fischer
Please enter the year of your birth: 1942
Michael Fischer becomes 76 years old in 2018.
—————————————————————
Normal termination.

The lines before the first prompt are printed by banner(). The lines after the age line are printed by bye().

3 Programming Notes

1. Copy the tools files from /c/cs427/code/tools/ into your working directory. Customize them by editing in your own name in place of the generic "Ima Goetting Closeau". Be sure to submit your tools files along with your code and other required files. Read Chapter 1 of the "Exploring C++" textbook for more information about tools, but be warned that my version of tools differs somewhat from what is in the textbook.

2.
You should create a file main.cpp with a main function exactly as follows:

int main() {
    banner();
    run();
    bye();
}

Your own code will go into main.cpp. Needed declarations should be placed before the function main(). You will need to define the function run(), which should be placed after the function main().

3. You must use the standard C++ iostreams facility for your I/O. To use it, one must #include the header file iostream. tools.hpp already does this for you, so all you need in main.cpp is the statement #include "tools.hpp". The standard input stream is cin; the standard output stream is cout. The input operator is >>; the output operator is <<, used with the standard streams cout and cin.

9. (1 pt) Program uses good() to check for input errors and takes appropriate action in case of error.
10. (2 pts) Program correctly computes the current year using the time() and localtime() functions.
11. (1 pt) A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17.
12. (1 pt) Running make results in an executable file aboutme and generates no errors or warnings.
13. (4 pts) A file named aboutme.out contains the output from at least two test runs of the program, one with correct input and one where a non-number was entered instead of a valid year. The inputs typed for each test run should also be included so the test output can be replicated.
14. (2 pts) All required files are submitted on Canvas as described in lecture 2.

20 Total points.

Figure 1: Grading rubric.

This short assignment is designed to deepen your understanding of C++ I/O and of character representations.

1 Assignment Goals

1. Learn how to use command line arguments.
2. Learn how to open a file and read its contents.
3. Learn how characters are represented by bytes in the computer.
4. Learn the difference between a character and its ASCII code.
5. Learn how to obtain the ASCII code of a character stored in a variable of type char.
6.
Learn how to print the character whose ASCII code is stored in a variable of type int.
7. Learn how to print an int as a decimal number.
8. Learn how to print an int as a hex number.
9. Learn how to test if a char is printable.
10. Learn how to use the output manipulators dec, hex, setw(), and setfill() to control the printed form of numbers.
11. Learn precisely what in>>val does to the istream in when val has type int.
12. Learn how to use in.get(ch) to read a single character from in.
13. Learn precisely what out<<x does.

If a number is successfully read into x, then x should be printed in decimal on a line by itself. If the attempt to read x fails, then the next character should be read from the stream using in.get(ch), where ch has type char, and a one-line "Skipping..." message should be printed. Depending on the character read, the message might look like either of the following:

Skipping char: 116 0x74 't'
Skipping char: 0 0x00

In each case, the ASCII code of ch is printed first in decimal, right-justified in a 3-character field without zero-fill, and then again in hex, prefixed by "0x", followed by a right-justified 0-filled hex number in a 2-character field. If ch is printable as defined by isprint(), then it should also be printed as a character, enclosed in single quotes as shown. For example, if file data.in contains the text:

Score was 35to21.

the output should be:

—————————————————————
Ima Goetting Closeau CPSC 427/527 Tue Oct 4 2016 11:18:06
—————————————————————
Skipping char: 83 0x53 'S'
Skipping char: 99 0x63 'c'
Skipping char: 111 0x6f 'o'
Skipping char: 114 0x72 'r'
Skipping char: 101 0x65 'e'
Skipping char: 119 0x77 'w'
Skipping char: 97 0x61 'a'
Skipping char: 115 0x73 's'
35
Skipping char: 116 0x74 't'
Skipping char: 111 0x6f 'o'
21
Skipping char: 46 0x2e '.'
Loop exit
—————————————————————
Normal termination.

Be sure you understand why there is no "Skipping" line for the spaces following "Score" and "was".
What happened to those characters?

3 Programming Notes

This program is very short and may be put entirely in the run() function in main.cpp. You must read x using the stream extraction operator >>. You may not use stringstream or getline() or other methods to read the line as a string or to read the individual digits that comprise a decimal number. You must let the stream do your decimal-to-binary conversion. Do not call atoi() or strtol() or any other means of manually converting a string to an int.

To obtain the ASCII code of a character stored in a char variable ch, cast ch to an int. Similarly, to print a character whose ASCII code is stored in an int variable x, cast x to a char before printing. (See https://www.cplusplus.com/reference/cctype/isprint/ for the definition of isprint().)

Grading Rubric

Your assignment will be graded according to the scale given in Figure 1 (see below).

1. (1 pt) All relevant standards from PS1 are followed regarding submission, identification of authorship on all files, and so forth.
2. (1 pt) A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17.
3. (1 pt) Running make successfully compiles and links the project and results in an executable file readint.
4. (1 pt) Your program gives a usage comment and terminates if the wrong number of command line arguments is given. It gives a descriptive error comment if the specified input file does not open.
5. (4 pts) All instructions given in sections 2 and 3 are carefully followed.
6. (4 pts) Your program correctly extracts all of the integers in the file.
7. (4 pts) Your program prints a correct "Skipping..." message following each failed attempt to read an integer.
8. (2 pts) The "Skipping..." message exactly follows the examples and instructions, including spacing and when to print leading 0's and when not to.
9. (2 pts) Your program correctly handles end-of-file, regardless of whether the EOF is immediately preceded by whitespace, a digit, or another character.

20 Total points.
Figure 1: Grading rubric.

1 Assignment Goals

1. Produce an application with more than one class, appropriately split into multiple files.
2. Learn how to use a constructor to produce a semantically valid non-trivial data structure.
3. Learn how to use classes and objects to model a physical structure.
4. Learn how to write a driver program to exercise and test a class.
5. Learn how to code within a prescribed and restricted subset of the language.

2 Think-A-Dot

2.1 Some history

Think-a-Dot is a mathematical toy introduced by E.S.R. Inc. in the 1960's. https://www.jaapsch.net/puzzles/images/thinkadot.jpg It is covered by U.S. patent 3,388,483, issued June 18, 1968 to Joseph A. Weisbecker. (See Fig. 1.)

Figure 1: U.S. patent 3,388,483, issued June 18, 1968 to Joseph A. Weisbecker.
Figure 2: Looking inside a slightly-modified box.

Handout #5—September 26, 2018

Ask some questions:

1. What is the structure of the machine? (See Figure 3.)
2. Starting from the all-yellow pattern, can one drop in marbles so as to make it all blue?
3. If so, can one get back to the all-yellow pattern?
4. How many of the 2^8 = 256 possible patterns can one reach from the initial state (all-yellow)?
5. Given that pattern s2 is reachable from pattern s1, how many marbles are needed? (Call this the directed distance from s1 to s2.)
6. Is the distance from s1 to s2 always the same as the distance from s2 to s1?
7. What is the largest distance between any pair of states for which the distance is defined?

Figure 3: Structure of the machine.

Binary Counter: Eight balls through hole C will cause gates 7–5–3 to behave like a binary counter and cycle through all eight possibilities. Gates above and to the left (1, 2, 4, 6) are not affected.

How can you get to a particular pattern? Starting from all yellow, how can one reach this goal? Here's one solution: (the handout shows the solution as a sequence of board diagrams here).

2.2 Further references

Google returns many hits for the search term "think-a-dot".
1. Some of the early pre-E.S.R. Think-a-Dot history.
2. A realistic Think-a-Dot simulator that you can play with, written in Scratch. This shows the original unmodified dot pattern that appears when the device is tipped to the right.
3. A Think-a-Dot-inspired electronic game from 2002.
4. Some of the mathematical theory behind Think-a-Dot (from Wikipedia, Think-a-Dot):
(a) Schwartz, Benjamin L. (1967), "Mathematical theory of Think-a-Dot", Mathematics Magazine, 40 (4): 187–193, doi:10.2307/2688674, MR 1571696.
(b) Beidler, John A. (1973), "Think-a-Dot revisited", Mathematics Magazine, 46: 128–136, doi:10.2307/2687967, MR 0379077.
(c) Gemignani, Michael (1979), "Think-a-Dot: a useful generalization", Mathematics Magazine, 52 (2): 110–112, doi:10.2307/2689850, MR 1572295.

3 Problem

You are to model a Think-a-Dot device and its behavior through a collection of C++ classes. You are also to write a command tad that allows a user to interact with your simulated Think-a-Dot device. User inputs are single-letter commands. All command letters are case insensitive, so 'Q' and 'q' for example have the same effect. The commands are:

• 'A', 'B', 'C' simulate the action of the machine when a ball is dropped in hole 'A', 'B', or 'C', respectively.
• 'L', 'R' cause the gates to be reset to all point the same way – all to the left or all to the right, respectively.
• The flip-flops should be colored as shown for the modified box in Figure 2.
• 'P' prints the state of the machine using three lines of text, e.g.,
R L R
L L
L L L
• 'H' prints a brief version of these instructions.
• 'Q' exits the program.

Your program will prompt the user to enter a command letter, check it for validity, and print the hole at which the ball exits the machine (hole 'P' or 'Q' as shown in Fig. 3).

4 Programming Notes

You will define and implement three classes: ThinkADot, FlipFlop, and Game. Class ThinkADot models the Think-A-Dot device. FlipFlop models a single flip-flop within the Think-a-Dot.
Game controls the user interaction with the Think-A-Dot. It prompts the user for command letters (read from cin) and prints results (to cout). It interacts with the Think-A-Dot to determine how the device responds to the various operations that can be performed on it.

Class Game should have a public function play() that starts the game. play() first creates a ThinkADot object whose flip-flops are colored as shown in Figure 2. It then enters the interactive loop that prompts the user for a command letter and performs the corresponding action.

Class FlipFlop models a single flip-flop. The state of a flip-flop is either “left” or “right” and should be represented by an enum type. (See the 08-brackets Token class for an example.) There should be a print() function that just prints a single letter ‘L’ or ‘R’ according to the current state of the flip-flop. There should also be a function flip() that flips the state from “left” to “right” or vice versa and returns the side of the flip-flop (“left” or “right”) that the ball is on when leaving the flip-flop. Thus, if the flip-flop is in the “left”-leaning state initially, the ball will pass to the right, and the new state will be “right”. For this class, it is okay to have public functions getState() and setState() to be used by member functions of class ThinkADot. A superior design would nest the entire FlipFlop class inside of the ThinkADot class, but for this assignment, FlipFlop should be a separate class at the same level as the others. (We will get to nested classes later in the course.)

Class ThinkADot models the device. It has a private array (not a vector) of eight FlipFlop objects that store the current state of each of the eight flip-flops. Its constructor should initialize all of the flip-flops to the “left” position, the same as the ‘L’ command.
ThinkADot also has public functions reset(), play(), and print() that carry out the actions ‘A’, ‘B’, ‘C’, ‘L’, and ‘R’ (with appropriate parameters) that can be performed on the device. The flip-flop states must be accessible only from these required member functions. In particular, there should be no getter or setter functions for the flip-flops.

The file main.cpp will have the same form as in PS1. The global function run() should have only two lines – one to instantiate Game and the other to call the Game object’s play() function.

4.1 Computing the next state

The tricky part of this assignment is how to update the state when a ball is dropped through one of the three holes ‘A’, ‘B’, or ‘C’. Referring to Figure 2, you can see seven channels through which the ball can pass. If we number them from 0 through 6, starting at the left, then we see that a ball dropped in hole ‘A’ enters channel 1. After passing through the first flip-flop, it moves either to channel 0 or to channel 2, depending on the state of the flip-flop. If it goes to channel 0, it drops straight through to the bottom and comes out on the left side. If it goes to channel 2, it encounters the first flip-flop in the second row. After passing through it, the ball enters channel 1 or 3, etc.

Your code should trace the path of a ball through the machine as described above, flipping each of the flip-flops encountered on the way and recording the last channel the ball was in. Clearly that will be channel 1, 3, 5, or 7. If it’s 1 or 3, the ball exits the machine through hole ‘P’. Otherwise, it exits through hole ‘Q’. Note that it would be ambiguous where the ball exits if it were coming from channel 4, but that is not possible.

4.2 No-no’s

There are many ways to implement a Think-a-Dot. For this assignment, you must do it as described above. Here are a few no-no’s, not because they’re necessarily wrong but because I want you to learn the particular techniques described above.

1. Don’t use new or delete.
2.
Don’t use any Standard Library container classes such as vector.
3. Don’t use a table lookup to find the next state.
4. Don’t use nested classes.
5. Don’t use language features that have not been presented in lecture or in any of the class examples. Don’t use prohibited features such as non-const global variables or goto’s.

If you think you need to violate any of these restrictions, please ask me for help.

5 Grading Rubric

Your assignment will be graded according to the scale given in Figure 4 (see below).

# Pts. Item
1. 1 All relevant standards from PS1 are followed regarding submission, identification of authorship on all files, and so forth.
2. 1 A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17.
3. 1 Running make successfully compiles and links the project and results in an executable file tad.
4. 1 tad smoothly interacts with the user. Clean, easily understood user prompts and help messages are given.
5. 2 Bad user inputs are handled gracefully and do not result in fatal errors.
6. 6 All of the functionality in section 3 is correctly implemented. In particular, each of the eight command letters works properly in both upper and lower case and carries out its assigned action correctly.
7. 3 The structure of the program matches the specification and restrictions given in section 4. No dynamic storage is used.
8. 1 Each function definition is preceded by a comment that describes clearly what it does.
9. 4 The program shows good style. All functions are clean and concise. Inline initializations, inline functions, and const are used where appropriate. Variable names are appropriate to the context. Programs are consistently indented according to the course indenting style. Each class has a separate .hpp file and, if needed, a separate .cpp file.
20 Total points.
Figure 4: Grading rubric.

1 Consensus Problem

In this and following assignments, we will be developing a simulator for a distributed consensus algorithm. Consensus is at the heart of maintaining consistency in distributed databases as well as in cryptocurrencies and blockchain algorithms.

We consider the consensus problem in a simple setting. The players are trying to reach agreement on a course of action. Each player has a current preference called her choice, which is the current value stored in her choice register. We assume a simple binary choice, so the choice value is either 0 or 1. The players communicate with each other, and from time to time a player may change her choice. The goal is for the players to arrive at a stage where all players are making the same choice. In this case, we say the players have reached consensus, and we call the common choice the consensus value. We also require that the consensus value be stable, meaning that once consensus has been reached, nobody can subsequently ever change her choice.

We assume the agents communicate using the random-pair communication model. In this model, a communication round consists of a randomly chosen player (called the sender) sending a message to another randomly chosen player (called the receiver). Sender and receiver must be distinct. For the algorithms considered here, the sender’s message is always her current choice value. The receiver, depending on the message received, may change her current choice and her internal state.

A population of players solves the consensus problem if the following is true:

1. For all possible initial choices of the players, if the players start in their designated initial states, the computation eventually reaches consensus with probability 1.
2. Once consensus has been reached, no player can subsequently ever change her choice.

Note that if all players start with the same choice, then that choice is the consensus value, and no player ever changes her choice.
There are many possible algorithms for reaching consensus. Here are a couple of very simple ones that we will be exploring.

1.1 Fickle

Whenever a fickle player receives a message, she changes her choice, if necessary, to agree with the sender’s choice. That is, she sets her choice register to the value contained in the message. It is easy to see that there is some sequence of message transmissions that causes the system to reach consensus. Once consensus has been reached, no player will change her choice since every subsequent message contains the consensus value. It is also easy to believe that it might take a very large number of random communication rounds to reach consensus.

Problem Set 4

1.2 Follow the Crowd

A follow-the-crowd player has a one-bit state register in which she saves the last message received. She changes her choice only when she gets two messages in a row that both disagree with her current choice. Thus, she waits until she gets a sense of the crowd before deciding to follow. We assume that each player starts with her state register set equal to her choice.

In greater detail, when a follow-the-crowd player receives a message m, she compares it with her current state. If it differs, she replaces the current state with m. If it is the same, she replaces the current choice with m.

It is believable that this might converge to a consensus value faster than fickle since it is less likely for a player holding the majority choice to change to the minority value.

2 Assignment Goals

1. To learn how to organize a simulation of a large system.
2. To learn about a simple model of asynchronous distributed computing.
3. To learn how to generate uniformly distributed random numbers from a finite interval.
4. To experience a computationally-intensive application where efficiency matters.
3 Problem

In this assignment, you will implement a simulation of a large number of agents attempting to reach consensus using the fickle algorithm under the random-pair communication model. The follow-the-crowd algorithm will be used in a later assignment.

You are required to implement two classes and a main program.

• class Agent models an agent running the fickle algorithm. The public interface must support these functions:
– Agent(int ch) constructs an agent with choice ch.
– void update(int m) performs the update to the agent as specified by algorithm fickle upon receipt of the message m.
– int choice() const returns the agent’s current choice.
• class Simulator simulates a collection of n agents trying to reach consensus using the random communication model described in section 1. Its public interface consists of the following:
– Simulator( int numAgents, int numOne, unsigned int seed ) constructs a simulator for numAgents agents. The first numOne of these have initial choice 1; the remainder have initial choice 0. seed is used to initialize the random number generator random().
– int run( int& rounds ) runs the simulation for as many rounds as it takes to reach consensus. The number of communication rounds used is stored in the output parameter rounds. The consensus value is returned.

Handout #6—October 22, 2018

To carry out the simulation requires the ability to select a random pair of distinct agents j and k to serve as sender and receiver in a communication round. Further details are given in section 4. To know when consensus is reached requires the simulator to keep track of the number of agents having each of the two possible choice values. Since the only way an agent might change its choice is through update(), you should just update the counts of agents having a given choice after each communication round. Do not poll every agent after every round.
• main.cpp implements a command

> consensus numAgents numOne [seed]

that takes two required arguments, numAgents and numOne, and one optional argument, seed. These three arguments should be converted to numbers and passed to the Simulator constructor. If seed is omitted, the result of time(0) should be used instead. The run() function in main.cpp should instantiate a Simulator with the given parameters and then run it. When Simulator::run() returns, you should print a single line to cout consisting of five whitespace-separated numbers: the number of agents, the number of agents initially choosing one, the actual seed used, the number of communication rounds required to reach consensus, and the final consensus value.

You should test your code using various combinations of parameters. Increasing the population size numAgents will cause a big increase in run time, as will having numOne be close to numAgents/2. You may terminate your experiments once the run time grows to more than a few seconds. This may come rather quickly with fickle, but that is for you to find out. As usual, you should submit test input files and the corresponding outputs produced by your program.

4 Program Notes

I will furnish some test cases on the Zoo in /c/cs427/code/ps4/, and you should also test your program with some parameter combinations of your own. However, you might only be able to duplicate my output if you run your code on the Zoo and your program uses the random number generator in the same way, namely, at each round, first select the sender and then select the receiver. To select a random sender from among n agents, you can use the function RandomUniform(n), which returns a uniformly-distributed random integer in the range [0 . . . n − 1].
int RandomUniform( int n ) {
    long int usefulMax = RAND_MAX - (RAND_MAX + 1L) % n;  // 1L avoids int overflow
    long int r;
    do {
        r = random();
    } while ( r > usefulMax );
    return r % n;
}

The purpose of this code is to make all numbers in the given range equally likely.1

To make sure you use the same number of calls on the random number generator as I do when choosing the sender and receiver, you should first choose the sender j from among the n agents. Now there are only n − 1 eligible receivers, so you should choose a number k in the range [0 . . . n − 2] and adjust k to avoid j by incrementing k if k ≥ j.

The submission guidelines are the same as in previous assignments. Submit all files needed to compile your project along with a Makefile. Include a notes.txt file, a file of sample inputs, and a file of the corresponding outputs.

5 Grading Rubric

Your assignment will be graded according to the scale given in Figure 1 (see below).

# Pts. Item
1. 4 All relevant standards from previous problem sets are followed regarding submission, identification of authorship on all files, and so forth. A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17. Running make successfully compiles and links the project and results in an executable file consensus. Each function definition is preceded by a comment that describes clearly what it does.
2. 2 Required sample input and output files are submitted.
3. 4 The program shows good style. All functions are clean and concise. Inline initializations, inline functions, and const are used where appropriate. Variable names are appropriate to the context. Programs are consistently indented according to the course indenting style. Each class has a separate .hpp file and, if needed, a separate .cpp file.
4. 2 Everything is private in all classes except for the specified public interface and any needed special functions (constructors, destructor, move and copy constructors and assignments).
5.
8 All of the functionality in section 3 is correctly implemented.
20 Total points.

Figure 1: Grading rubric.

1Note that it is not sufficient to just take random()%n, since if n does not divide RAND_MAX + 1, some numbers will have a greater probability of being chosen than others. For example, if random() were to produce numbers in the range [0 . . . 9] and we wanted numbers in the range [0 . . . 3], then reducing each of the numbers in the range [0 . . . 9] mod 4 gives the sequence 0, 1, 2, 3, 0, 1, 2, 3, 0, 1. We see that 0 and 1 each occur three times, whereas 2 and 3 each occur only twice. Thus, 0 and 1 are each generated with probability 0.3, and 2 and 3 are each generated with probability only 0.2. To be uniformly distributed, all probabilities should be 0.25. In this example, where we pretend RAND_MAX == 9 and n = 4, my code computes usefulMax to be 9 − 10%4 = 7. Then whenever random() returns 8 or 9, the program loops and tries again.

1 Assignment Objective

This problem set continues the development begun in Problem Set 4 of a simulator for a population of simple agents attempting to reach consensus on a choice value. The PS4 assignment handout describes two different such agent algorithms: fickle and follow the crowd. In PS4, you implemented a simulator for a collection of fickle agents. In this assignment, you will add a number of new features to the PS4 consensus program. Just as in real life, refactoring code to handle new requirements is easy in places and harder in others.

2 Assignment Goals

• Learn to use polymorphism.
• Experience the effect of refactoring a big class (Simulator) into two related but smaller classes.
• Learn how to explore a rich parameter space, and gain insight into the behavior of random processes.

3 Problem

Here is an overview of the new features and required changes:

1. Agent will become a pure abstract class. Recall that this means all functions are virtual, and all are abstract except for the virtual destructor.
The public Agent functions are the same as before: update() and choice().
2. Fickle is a new class publicly derived from Agent. It is pretty much the same as the Agent class from PS4. Crowd is another new class publicly derived from Agent. It implements the follow-the-crowd algorithm described in the PS4 assignment handout.
3. The Simulator of PS4 did two distinct jobs: (a) it set up the population of agents to be simulated, and (b) it ran the simulation. In this assignment, you will separate these two tasks. A new class Population will create and manage the agents. The revised Simulator will take a Population reference as a parameter and run the simulation, as before, until consensus is reached.
4. Population will maintain the aggregated array of Agent that was previously in Simulator. However, since Agent is now the base class for two different derived classes, the array elements will be Agent pointers. Each agent will be created using either new Fickle( val ) or new Crowd( val ), depending on which kind of agent is desired. Here val is the initial choice value for that agent. Population will retain custody of all of these agents, so its destructor must take care to delete them all.
5. The method for constructing agents is different from PS4. Each agent is randomly assigned to one of the two concrete agent types, Fickle or Crowd. The initial value val is randomly chosen from the set {0, 1}. Both of these random choices are biased according to new command line parameters. probFickle is a real number in the semi-open interval [0, 1) and specifies the probability that an agent is chosen to be of type Fickle rather than of type Crowd. probOne similarly is a real number in the range [0, 1) and specifies the probability that an agent’s initial choice is 1 rather than 0.
6. Population has several public functions in addition to constructors, destructor, and print:
(a) int size() const returns the number of agents.
(b) void sendMessage(int sender, int receiver) simulates a single communication step from sender to receiver.
(c) bool consensusReached() returns true iff consensus has been reached.
(d) int consensusValue() returns the consensus value if consensus has been reached; otherwise it returns -1.
7. The Simulator constructor now takes only a single parameter of type Population&. Its run function has signature void run(). To obtain the results of the simulation, the caller can call two new public functions: numRounds() and consensusValue(). Since the simulator is doing the simulation, it knows how many rounds it has used. On the other hand, only Population knows the consensus value, so this is a case where delegation should be used.
8. main.cpp changes considerably. It takes different command line arguments and it prints different output than PS4.
(a) The new command line arguments are

numAgents probFickle probOne [seed]

where seed is optional as before. numAgents is again the total number of agents. probFickle and probOne are the probabilities discussed in item 5 above.
(b) All output should go to cout. banner() and bye() should be used as usual. The output from a run has three parts: the initial parameters, the statistics of the population after the random generation process, and the results of the simulation. See sample.out for an example of the new output format.
9. The resulting executable file should be called consensus2 to distinguish it from the PS4 command name.

An important part of this assignment is to test your program on reasonable test cases, to submit the test case inputs and corresponding outputs, and to report on what you observe. For example, you should run your code on extreme cases such as 0 agents, 1 agent, probabilities of 0.0 and 1.0, and so forth. Try to gain some insight into the extent to which follow-the-crowd agents do better than fickle agents.
For example, what do you observe with a modest size population (say 1000) with different values for probFickle, say, 0.0, 0.01, 0.5, 0.99, 1.0?

Handout #7—October 31, 2018

4 Programming Notes

1. random() is now used in two different parts of the code.
(a) Simulator::run() uses uRandom() as before in order to choose first a sender and then a receiver for a communication step. uRandom() of course is based on random().
(b) The Population constructor needs random values when constructing each agent. First it uses randomness to choose an initial choice value for the agent. Then it uses it to choose the agent type. In both cases, it chooses a double in the semi-open interval [0, 1) and compares that number with the desired probability in order to make its decision. For example, to generate the choice value, test if the random number is less than the desired probability of choosing 1. If it is, then choose 1; else choose 0.
2. To choose a random real in [0, 1), you can use my code

double Population::dRandom() {
    return random()/(RAND_MAX+1.0);   // result is double in [0,1)
}

By default, the type of “1.0” is double, so the coercion rules force the addition and then the division to both be performed using double arithmetic. If you change “1.0” to “1”, it won’t compile without warnings, and it won’t work correctly.
3. In order to duplicate my output, you will need to use the random number generator in exactly the same way as my program does. In particular, you will need to choose the sender before the receiver, and you will need to choose the initial choice value for an agent before choosing the agent type. Of course, you will also need to start with the same seed.
4. The submission guidelines are the same as in previous assignments. Submit all files needed to compile your project along with a Makefile. Include a notes.txt file, a file of sample inputs, and a file of the corresponding outputs.
5.
Note that the grading rubric for this assignment puts more emphasis on good design, good style, and good choice of test data than the previous assignments.

Problem Set 5

5 Grading Rubric

Your assignment will be graded according to the scale given in Figure 1 (see below).

# Pts. Item
1. 4 All relevant standards from previous problem sets are followed regarding submission, identification of authorship on all files, and so forth. A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17. Running make successfully compiles and links the project and results in an executable file consensus2. Each function definition is preceded by a comment that describes clearly what it does.
2. 3 Sample input and output files are submitted that show good coverage of the parameter space, e.g., small inputs, large inputs, edge cases for the probabilities (e.g., 0.0 and 1.0) as well as reasonable intermediate cases.
3. 5 The program shows good style. All functions are clean and concise. Inline initializations, inline functions, and const are used where appropriate. Variable names are appropriate to the context. Programs are consistently indented according to the course indenting style. Each class has a separate .hpp file and, if needed, a separate .cpp file. However, it is acceptable to group the three polymorphic agent classes together in the same .hpp and .cpp files.
4. 2 Everything is private in all classes except for the specified public interface, any needed special functions (constructors, destructor, move and copy constructors and assignments), and functions needed for debugging.
5. 6 All of the functionality in section 3 is correctly implemented.
20 Total points.

Figure 1: Grading rubric.

1 Introduction

This problem set continues the development begun in Problem Set 4 of a simulator for a population of simple agents attempting to reach consensus on a choice value.
The long-range goal of this and following assignment(s) is to simulate the blockchain consensus algorithm used in the Bitcoin cryptocurrency, sometimes called Nakamoto consensus, to see how fast consensus is reached under various assumptions about the speed and reliability of the underlying network and the honesty of the agents participating in the protocol.

2 Blockchain Background

While not strictly needed for this assignment, a general understanding of blockchains and Nakamoto consensus will help motivate it.

A blockchain is a sequence of records or blocks that are cryptographically protected to preserve integrity and prevent various kinds of tampering. A blockchain can be extended only by someone who knows the solution to a difficult cryptographic puzzle that is derived from the current blockchain. An agent, called a miner, who wants to extend the blockchain first has to solve the puzzle for that chain. In general, many miners are working in parallel to solve the puzzle. Anyone who succeeds is able to create a new block of transactions and append it to the end of the chain. The result is a new, longer chain that is verifiably valid.

Once a new chain has been produced, the successful miner sends it around the network to other miners. When a longer (valid) chain is received from another miner, the recipient discards the old shorter chain and begins trying to solve the puzzle for the longer chain. Consensus on the new chain is reached when all of the miners have received the new chain and discarded their old, shorter chains.

It is possible that two miners will solve the puzzle at nearly the same time and will each propose longer but different extensions of the current chain. These new chains will propagate around the network. Because they are of equal length, neither will annihilate the other, so consensus cannot be reached until a yet longer chain is produced by one of the miners.
As this new chain propagates through the network, miners will discard their old chains and adopt the new one. Intuitively, consensus will eventually be reached if there is a unique longest chain in circulation and no new chains are created before consensus is reached. The purpose of the puzzles (also called proof of work) is to slow down the process of creating new chains, which in turn will decrease the likelihood of new chains interfering with the consensus process.

In practice, one does not require consensus on what the current blockchain is. Rather, one is interested in when consensus is reached on a particular block. For example, suppose a miner extends a length-10 blockchain c to create a new blockchain whose last (11th) block is b. Intuitively, b is committed if b is the 11th block of every longer blockchain still in circulation. Of course, there is always the possibility that some miner still working on the old chain c will succeed in creating a new length-11 chain with a different 11th block. However, the chance of that block not being annihilated as it attempts to infect other miners is vanishingly small under suitable circumstances.

Problem Set 6

3 Problem

This assignment focuses on a particular space-efficient representation of blockchains. At an abstract level, a blockchain is just a sequence of blocks. A new blockchain is created by appending a new block to the end of an existing blockchain. The initial blockchain consists of a single genesis block. All subsequent blockchains begin with the genesis block. We ignore the cryptography issues of real blockchains and assume that the only properties of interest in a blockchain are its length and the list of blocks that comprise the chain. Each block has an associated unique identifier, so there is never the possibility of two identical blocks being created in different parts of the system.

Figure 1 shows a situation with three active blockchains beginning with blocks Bk2, Bk3, and Bk4, respectively.
Four agents ChA, . . . , ChD have three different current choices for the blockchain. ChA prefers the chain beginning with Bk2, ChB and ChC both prefer the chain at Bk4, and ChD prefers the chain at Bk3.

Figure 1: Three active blockchains.

Our blockchain representation has two goals:

1. No block is ever copied.
2. A block is automatically deleted as soon as it becomes inaccessible.

For goal 1, we delete the copy constructor and copy assignment (to prevent accidental copying), and we use pointers to represent the chain structure as a kind of linked list. For goal 2, we replace the arrows of Figure 1 with a slight modification of the SPtr class presented in demo 21a-SmartPointer-v2. Figure 2 shows the smart pointers as white rectangles inside both the blockchain headers (possessed by the agents) and inside the blocks themselves. The dashed white boxes represent the count dynamic extensions in the SPtr class. Recall that this is a count of the number of SPtr objects having the same target pointer.

Handout #8—November 26, 2018

Figure 2: Three active blockchains.

Your job is to implement two new classes, Blockchain and Block, to represent blockchains. In addition to the SPtr data member, each Block will also have two const fields: a unique identifier and its level in the blockchain tree, where the genesis block is considered to be at level 0. (The genesis block in Figures 1 and 2 is Bk1.)

Class Blockchain contains a single private data member of type SPtr, which implements the smart pointer to the most recent block in the chain. It should have several public functions:

1. Blockchain extend() returns a new blockchain created by extending the current blockchain. The new chain should be stack-allocated and returned by value.
2. print() prints the blocks that comprise a blockchain in order of increasing level.
For example, the output from printing the blockchain ChC in Figure 2 might look like [0,1] [1,2] [2,4]. Here, the first number of each pair is the block’s level in the tree and the second number is its UID.
3. A dereferencing operator that behaves the way the corresponding operator behaves on ordinary C pointers.

You might also want to define a function to return the C pointer target that the SPtr object manages.

6 Grading Rubric

Your assignment will be graded according to the scale given in Figure 3 (see below).

# Pts. Item
1. 4 All relevant standards from previous problem sets are followed regarding submission, identification of authorship on all files, and so forth. A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17. Running make successfully compiles and links the project and results in an executable file blockchain. Each function definition is preceded by a comment that describes clearly what it does.
2. 4 Sample input and output files are submitted that show the program behaves as expected. In particular, you should create a test file that grows a blockchain structure like the one in Figure 1 and then proceeds to replace some of the blockchains with others. An inaccessible block should be deleted automatically when the last smart pointer to it is deallocated. It may be useful to leave the SPtr debugging printout in place that shows when a block is deleted.
3. 4 The program shows good style. All functions are clean and concise. Inline initializations, inline functions, and const are used where appropriate. Variable names are appropriate to the context. Programs are consistently indented according to the course indenting style. Each class has a separate .hpp file and, if needed, a separate .cpp file.
4. 4 All of the functionality in section 3 is correctly implemented.
5. 4 Valgrind gives clean output with all storage blocks freed except for the instantiation of Serial.
20 Total points.
Figure 3: Grading rubric.

1 Introduction

This problem set continues the development begun in Problem Set 4 of a simulator for a population of simple agents attempting to reach consensus on a choice value. The PS4 assignment handout describes two different such agent algorithms: fickle and follow the crowd. In PS4, you implemented a simulator for a collection of fickle agents. In the PS5 assignment, you refactored the PS4 solution to add a number of new features. In particular, you made Agent into a pure abstract class, you split off a new Population class from the Simulator class, and you changed the method used for building the initial population from the command line parameters. In the PS6 assignment, you implemented new classes Block and Blockchain, and you imported and perhaps modified demo classes SPtr and Serial. The goal of this assignment is to simulate the blockchain consensus algorithm used in the Bitcoin cryptocurrency, sometimes called Nakamoto consensus, to see how consensus gradually evolves.

2 Teaching Objectives

• Gain more experience in refactoring code to adapt it to new requirements.
• Learn how to factor out common code in a polymorphic class hierarchy.
• Find additional uses of delegation in order to keep functions close to the data members that they need.
• Learn how objects of class Blockchain can be passed around freely in the simulator without concern about storage management issues.
• Take a step closer to simulating a realistic blockchain consensus algorithm.

3 Problem

Integrate and extend your PS5 and PS6 solutions to result in a simulator for agents that are attempting to agree on a blockchain rather than on a single bit. In particular, you will need to do the following:

1. Find everywhere in your code from PS5 involved with reaching consensus on a 0-1 value of type int. Change the value type instead to Blockchain.
In particular, the abstract virtual function choice() in Agent should now return a value of type Blockchain rather than of type int.

Problem Set 7

2. Add a new abstract virtual function extend() to class Agent that causes the agent to extend its current blockchain and to make the extended blockchain its new choice. This will apply to all agent classes, including Fickle and Crowd.[1]

3. Add another agent class, Nakamoto, to model the Nakamoto algorithm's rule of ignoring a new blockchain received from another agent unless the new chain is longer than the current one, in which case the longer blockchain replaces the shorter one.

4. Add a new class AgentBase that is publicly derived from Agent and from which the actual agents Fickle, Crowd, and Nakamoto will be publicly derived. The purpose of AgentBase is to give a place for the data members and functions that are common to all agents, such as the agent's current choice and the public functions choice() and extend(). The reason for not putting these things directly in Agent is that Agent is a pure abstract class, so it cannot be instantiated and therefore also cannot have data members.

5. Change Simulator to randomly decide at each step whether to perform an update() operation or an extend() operation. It does this by calling dRandom() and comparing the result to a new command line argument probExtend. In case it decides to simulate an extend, it chooses an agent at random to perform the extend. Similarly, if it decides to simulate the sending of a message, it works as in PS4 and PS5 by choosing a random sender and a distinct random receiver for the simulated sending of a message.

6. Instead of running until consensus is reached, the new simulator will run for maxRounds rounds, where maxRounds is a new command line argument that is passed to Simulator::run() as a parameter.
Thus, there is no longer any need for the functions and data members that were formerly involved in trying to determine whether or not consensus had been reached, and if so, what the consensus value was. Rather, at the end of the simulation, we'll simply print out a list of agents with each agent's current choice.

7. The code in class Population that creates the population should now make a 3-way choice between Fickle, Crowd, and Nakamoto. New command line parameters provide the desired probabilities for each kind of agent. All agents, regardless of type, now start with copies of the same initial genesis Blockchain for their initial choice. The genesis Blockchain contains a smart pointer that points to the genesis Block. The genesis Block is unique in that its SPtr data member has both target and count set to nullptr.

8. Population should also have functions extend(int receiver) and sendMessage(int sender, int receiver) to translate between the simulator's use of integers to identify agents and the agents themselves, which actually carry out those operations. The agents in turn delegate some of the work to Blockchain functions that were defined in PS6.

[1] This is intended to model what happens when a Bitcoin miner successfully solves the current proof-of-work puzzle and is therefore allowed to add a set of transactions to the current blockchain.

Handout #9—December 3, 2018

9. Modify main.cpp to accept the following command line arguments:

numAgents maxRounds probNak probFickle probExtend [seed]

where the arguments have the following meanings:

numAgents: The total number of agents (as before).
maxRounds: The total number of simulation rounds to perform.
probNak: The probability of selecting a Nakamoto agent when building the population.
probFickle: The probability of selecting a Fickle agent when building the population.
probExtend: The probability that the simulator chooses to simulate an extend() operation rather than a sendMessage() operation.
[seed]: Optional seed for the random number generator (as before).

The probability of selecting a Crowd agent is 1.0 − probNak − probFickle. It is an error if the result does not lie in the closed interval [0.0, 1.0].

10. Modify Population to have two print functions, one which prints the statistics as in PS5 (but I'm now calling it printStats()), and one that prints out each agent's choice of blockchain, one per line (which is what I'm now calling print()). Naturally, print() delegates the printing of an agent's current choice to Blockchain::print().

A sample call on your simulator is given on the Zoo in /c/cs427/code/ps7/sample.sh, along with the output from a run on my machine. To give a better idea of what the simulator is doing, I have added print statements to the simulator to show what operation is being performed at each step of the simulation. I've also added a print statement to Population::extend() to print each new blockchain when it is first produced.

4 Programming Notes

1. There should be no public data members and no use of friend classes or functions.

2. Dynamic memory (allocated by new) should only be used in the following places:
(a) Within the furnished classes SPtr and Serial;
(b) For creating objects of type Block;
(c) For creating objects of type Agent and for creating the array of agents.
All Blockchain objects should be stack allocated.

3. The only delete statements outside of class SPtr should be to delete objects created in case 2c above. You should not explicitly delete any blocks. They should be deleted automatically by the SPtr objects that manage them.

5 Grading Rubric

Your assignment will be graded according to the scale given in Figure 1 (see below).

1. (3 pts.) All relevant standards from previous problem sets are followed regarding submission, identification of authorship on all files, and so forth. A well-formed Makefile or makefile is submitted that specifies compiler options -O1 -g -Wall -std=c++17.
Running make successfully compiles and links the project and results in an executable file blockchain. Each function definition is preceded by a comment that describes clearly what it does.

2. (2 pts.) Sample input and output files are submitted that show good coverage of the parameter space, e.g., small inputs, large inputs, edge cases for the probabilities (e.g., 0.0 and 1.0) as well as reasonable intermediate cases. This is in addition to the furnished sample file.

3. (3 pts.) The program shows good style. All functions are clean and concise. Inline initializations, inline functions, and const are used where appropriate. Variable names are appropriate to the context. Programs are consistently indented according to the course indenting style. Each class has a separate .hpp file and, if needed, a separate .cpp file. However, it is acceptable to group the three polymorphic agent classes together in the same .hpp and .cpp files.

4. (2 pts.) The restrictions in section 4 are all obeyed.

5. (10 pts.) All of the functionality in section 3 is correctly implemented.

Total: 20 points.

Figure 1: Grading rubric.

1 Introduction

This 10-point assignment is required for graduate students and anyone else registered under CPSC 527. It is optional for students registered under CPSC 427. For those students, points earned on this assignment will offset points lost on previous homework assignments, but they will not apply against points lost on exams, nor will they permit anyone to earn more than 100% of the possible homework points. The due date for this assignment is the same as that for homework assignment 7. However, they are separate assignments and should be submitted separately.

2 Problem

The goal of this assignment is to modify your solution to homework assignment 7 to gather and print additional information about the progress of the simulation. An inventory of blockchains is a précis of the current set of choices of all of the agents in the population.
A sample inventory from a real simulation run with 10 agents is:

Inventory: 5 active blockchain(s):
1 copies of [0,1] [1,46] [2,123] [3,255]
1 copies of [0,1] [1,46] [2,123] [3,271]
6 copies of [0,1] [1,46] [2,123]
1 copies of [0,1] [1,46] [2,160]
1 copies of [0,1] [1,46] [2,236]

We see that all of the agents agree on the level-0 and level-1 blocks of each chain, and eight agents agree on the level-2 block [2,123]. However, two different chains have already been forked from it, so two agents have distinct level-3 blocks, and it is unclear which will eventually win out. Thus, an inventory is a compressed version of the current choices of the population that allows one to relatively quickly find those blocks for which consensus has already been achieved. However, finding consensus blocks is not a part of this assignment. All that is required is to create the initial inventory, to maintain it step by step during the simulation, and to print it when required (in the format shown above).

3 Programming Notes

1. You should create a class Inventory whose sole data member is a std::map.

2. To maintain the inventory, you should write functions add() and sub() that add (insert) and subtract (remove) blockchains from the inventory, respectively. After each simulation step, if the agent's new and old choices differ, the new one should be added and the old one subtracted from the inventory.

3. There should be no occurrences of new in PS8 other than those already present in PS7. In particular, the map should be composed in Inventory and not be a dynamic extension. Keeping the use of dynamic storage confined to classes that can manage it, such as SPtr and std::map, simplifies your code and makes it less likely to have hidden errors. Remember the motto: "No new's is good news!"

4. You should define Blockchain::operator
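The automatic-deletion behavior these handouts build on comes entirely from the shared count inside SPtr. A minimal sketch of that idea follows; it is an illustration only, not the furnished demo 21a class, and the member names (target, count, alive) are guesses:

```cpp
#include <cassert>
#include <utility>

struct Block;

// Hypothetical sketch of the SPtr idea: every SPtr aimed at the same Block
// shares one dynamically allocated count of how many owners exist.
class SPtr {
public:
    SPtr() = default;
    explicit SPtr(Block* t) : target(t), count(new int(1)) {}
    SPtr(const SPtr& other) : target(other.target), count(other.count) {
        if (count) ++*count;               // one more owner of the same block
    }
    SPtr& operator=(SPtr other) {          // copy-and-swap assignment
        std::swap(target, other.target);
        std::swap(count, other.count);
        return *this;                      // old target released when other dies
    }
    ~SPtr() { release(); }
    Block* operator->() const { return target; }
private:
    void release();                        // defined below, after Block is complete
    Block* target = nullptr;
    int* count = nullptr;
};

// A Block carries its const UID and level, plus a smart pointer to its
// predecessor in the chain (the genesis block's prev is empty).
struct Block {
    static inline int alive = 0;           // live-block counter, for testing only
    Block(int u, int l, SPtr p) : uid(u), level(l), prev(p) { ++alive; }
    ~Block() { --alive; }
    const int uid;
    const int level;
    SPtr prev;
};

void SPtr::release() {
    if (count && --*count == 0) {          // last owner gone:
        delete target;                     // frees the block, which releases
        delete count;                      // its prev and can cascade down
    }
}
```

When the last SPtr aimed at a block is destroyed, the count reaches zero and the block is freed; destroying the block in turn releases its prev pointer, so deletion cascades down the chain until it reaches a block that some other chain still shares.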


[SOLVED] Csci596 assignment 1 to 7 solutions

0.1. Asymptotic complexity analysis

Read a lecture note (https://cacs.usc.edu/education/cs596/AsymptoticAnalysis.pdf) and "Appendix A.2—Order Analysis of Functions" of Introduction to Parallel Computing by Grama et al. (page 581 of the PDF file, to which the link is found at the course homepage, https://cacs.usc.edu/education/cs596.html, under the "Textbooks" heading).

0.2. Theoretical peak performance of a computer

Read a lecture note (https://cacs.usc.edu/education/cs596/PeakFlops.pdf) to learn how to compute the theoretical peak floating-point performance of a computer.

Now, here is the actual assignment: submit your answers to the following two questions.

1. Measuring Computational Complexity

Use the data file, MDtime.out, in the assignment 1 package. In the two-column file, the left column is the number of atoms, N, simulated by the md.c program, whereas the right column is the corresponding running time, T, of the program in seconds. Make a log-log plot of T vs. N. Perform a linear fit of log T vs. log N, i.e., log T = a log N + b, where a and b are fitting parameters. Note that the coefficient a signifies the power with which the runtime scales as a function of problem size N: T ∝ N^a. For detail, see slide 31 in https://cacs.usc.edu/education/cs596/01MDVG.pdf. Submit (i) the plot and (ii) your fitted value of a.

For this and subsequent assignments, you need to use a scientific plotting software like Grace, Origin, Kaleidagraph, Gnuplot, Matlab, Mathematica, or Excel. Please make sure that you are familiar with one such software. For this assignment, you also need to use the least-square fitting feature of your plotting software. In case you cannot find such a feature, you can do it yourself following the lecture note on "least square fit of a line" (https://cacs.usc.edu/education/cs596/LeastSquareFit.pdf).

2.
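The "least square fit of a line" used in question 1 has a simple closed form for the slope and intercept. A minimal sketch (the function name is hypothetical and the test data below is synthetic, not MDtime.out):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Ordinary least squares for y = a*x + b over points (x[i], y[i]).
// For the scaling exercise, x = log N and y = log T, so the fitted slope a
// is the empirical complexity exponent in T ~ N^a.
struct Fit { double a, b; };

Fit leastSquares(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    const double a = (n * sxy - sx * sy) / (n * sxx - sx * sx);  // slope
    const double b = (sy - a * sx) / n;                          // intercept
    return {a, b};
}
```

Feeding it log N and log T pairs from the data file would recover the exponent a directly, which is exactly what the plotting software's fit feature computes.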
Theoretical Flop/s Performance

Suppose that your computer has only one octa-core processor (a processor equipped with 8 processing cores) operating at a clock speed of 2.3 GHz (i.e., the clock ticks 2.3×10^9 times per second), in which each core can perform 1 multiplication and 1 addition per clock cycle using a fused multiply-add (FMA) circuit. Assume that each multiply or add operation is performed on vector registers, each holding 4 double-precision (i.e., 4×64 = 256 bits) operands. What is the theoretical peak performance of your computer in terms of double-precision Gflop/s (gigaflop/s or 10^9 flop/s)? Submit the computed number.

(Optional) A program named lmd_sqrt_flop.c is provided in the gzipped tar archive, cs596-as01-tar.gz, on Blackboard, along with instructions to compile and run the program in the README file.* This is a linked-list-cell molecular dynamics program, in which the sqrt() function is implemented using a polynomial, for counting the number of floating-point operations (see the lecture note on "Arithmetic implementation of sqrt() and floating-point performance", https://cacs.usc.edu/education/cs596/Sqrt.pdf). Compile and run the lmd_sqrt_flop.c program on a computer of your choice, and report the flop/s performance you get. Better, answer: how many % of the theoretical peak flop/s performance of the computer did you achieve?

* To extract the files from the archive, type tar xvfz cs596-as01-tar.gz.

Goal: Implement Your Own Global Summation with Message Passing Interface

In this assignment, you will write your own global summation program (equivalent to MPI_Allreduce) using MPI_Send and MPI_Recv. Your program should run with P = 2^l processes (or MPI ranks), where l = 0, 1, ... Each process contributes a partial value, and at the end, all the processes will have the globally-summed value of these partial contributions.
Your program will use a communication structure called a butterfly, which is structured as a series of pairwise exchanges (see Fig. 1, where messages are denoted by arrows). This structure allows a global reduction among P processes to be performed in log2 P steps:

a000 + a001 + a010 + a011 + a100 + a101 + a110 + a111
= ((a000 + a001) + (a010 + a011)) + ((a100 + a101) + (a110 + a111))

At each level l, a process exchanges messages with a partner whose rank differs only at the l-th bit position in the binary representation (Fig. 1).

Fig. 1: Butterfly network used in hypercube algorithms.

HYPERCUBE TEMPLATE

We can use the following template to perform a global reduction using any associative operator OP (such as multiplication or maximum), i.e., (a OP b) OP c = a OP (b OP c) [1-3]:

procedure hypercube(myid, input, logP, output)
begin
  mydone := input;
  for l := 0 to logP-1 do
  begin
    partner := myid XOR 2^l;
    send mydone to partner;
    receive hisdone from partner;
    mydone := mydone OP hisdone
  end;
  output := mydone
end

USE OF BITWISE LOGICAL XOR

Note that 0 XOR 0 = 1 XOR 1 = 0 and 0 XOR 1 = 1 XOR 0 = 1, so that XOR with 1 flips a bit while XOR with 0 leaves it unchanged:

a XOR 1 = ā
a XOR 0 = a

where ā is the complement of a (ā = 1|0 for a = 0|1). In particular, myid XOR 2^l reverses the l-th bit of the rank of this process, myid:

abcdefg XOR 0000100 = abcdēfg

Note that the XOR operator is ^ (caret symbol) in the C programming language.

ASSIGNMENT

Complete the following program by implementing the function global_sum, using the MPI_Send and MPI_Recv functions and the hypercube template shown above. Submit the source code as well as the printout from a test run on 4 processors and that on 8 processors.
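Before writing the MPI version, the butterfly pattern can be sanity-checked serially by simulating all P buffers inside one process; each "rank" combines its value with that of the partner whose ID differs in one bit. A sketch (hypothetical function name; a plain stand-in for the message exchanges, not the required MPI_Send/MPI_Recv solution):

```cpp
#include <cassert>
#include <vector>

// Serial simulation of the hypercube (butterfly) all-reduce with OP = +.
// vals[i] plays the role of process i's buffer; P must be a power of 2.
std::vector<double> butterflyAllReduce(std::vector<double> vals) {
    const std::size_t P = vals.size();
    for (std::size_t l = 1; l < P; l <<= 1) {      // level masks 2^0, 2^1, ...
        std::vector<double> next(P);
        for (std::size_t id = 0; id < P; ++id)
            next[id] = vals[id] + vals[id ^ l];    // exchange with bit-l partner
        vals = next;                               // all exchanges at this level
    }                                              // happen "simultaneously"
    return vals;    // every simulated process now holds the global sum
}
```

After log2 P levels every entry holds the same total, which is exactly the property the MPI program must exhibit: each rank ends with the globally-summed value, with no root process involved.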
#include "mpi.h"
#include <stdio.h>

int nprocs; /* Number of processes */
int myid;   /* My rank */

double global_sum(double partial) {
  /* Implement your own global summation here */
}

int main(int argc, char *argv[]) {
  double partial, sum, avg;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
  partial = (double) myid;
  printf("Node %d has %le\n", myid, partial);
  sum = global_sum(partial);
  if (myid == 0) {
    avg = sum/nprocs;
    printf("Global average = %le\n", avg);
  }
  MPI_Finalize();
  return 0;
}

References

1. Slides 20-24 in https://cacs.usc.edu/education/cs596/MPI-VG.pdf.
2. https://en.wikipedia.org/wiki/Hypercube_(communication_pattern).
3. I. Foster, Designing and Building Parallel Programs (Addison-Wesley, 1995), Chap. 11—Hypercube algorithms, https://www.mcs.anl.gov/~itf/dbpp/text/node123.html.

The purpose of this assignment is to acquire hands-on experience on the scalability analysis of a parallel program — one of the key skills you learn in this class. We use a simple application that utilizes the function you have written for assignment 2, where the purpose was to: (i) convince ourselves that MPI_Send() and MPI_Recv() are sufficient to build any parallel program, using global reduction as a concrete example; and (ii) perform a unit software test of the global_sum() function used in this assignment.

Part I: Programming

Write a message passing interface (MPI) program, global_pi.c, to compute the value of π based on the lecture note on "Parallel Computation of Pi" and using the global_sum() function you have implemented and unit-tested in assignment 2. Please also utilize the serial program pi.c (which computes the value of π) in the assignment 3 package.

(Assignment)

1. Submit the source code of global_pi.c.

(Note)

• Insert the MPI_Wtime() function (which takes no argument and returns the wall-clock time in seconds as a double) to measure the running time of the program.
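The quadrature underlying pi.c and global_pi.c is commonly the midpoint rule applied to the integral of 4/(1+x^2) over [0,1], whose exact value is π; assuming that is the rule used here, a serial sketch of the kernel looks like this (in global_pi.c each rank would sum only its share of the bins and then combine the partial sums with global_sum()):

```cpp
#include <cassert>
#include <cmath>

// Midpoint-rule quadrature of f(x) = 4/(1+x^2) on [0,1], which integrates
// to pi. nbin is the total number of quadrature bins (NBIN in the handout).
double computePi(long nbin) {
    const double step = 1.0 / static_cast<double>(nbin);
    double sum = 0.0;
    for (long i = 0; i < nbin; ++i) {
        const double x = (static_cast<double>(i) + 0.5) * step;  // bin midpoint
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * step;
}
```

The parallel version simply splits the i loop across ranks; since every bin contributes independently, the only communication needed is the final global sum, which is why this application makes a clean unit test for global_sum().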
Part II: Scalability

In this assignment, we measure the scalability of global_pi.c.

(Assignment)

2. (Fixed problem-size scaling) Run your global_pi.c with a fixed number of quadrature points, NBIN = 10^9, while varying the number of compute nodes over 1, 2, and 4, with 1 processor per node (i.e., the number of processors P = 1, 2, and 4). Plot the fixed problem-size parallel efficiency as a function of P. Submit the plot.

3. (Isogranular scaling) In this scalability test, we consider a constant number of quadrature points, NBIN/P = 10^9, per processor for P = 1, 2, and 4. To do this, we slightly modify global_pi.c by defining

#define NPERP 1000000000 /* Number of quadrature points per processor */
long long NBIN;

and determining the total number of quadrature points as

NBIN = (long long)NPERP*nprocs;

Run the resulting program, global_pi_iso.c, and plot the isogranular parallel efficiency as a function of P. Submit the plot.

(Note)

• Please perform the entire scaling tests in a single batch job to minimize measurement fluctuations, using the Slurm script, global_pi.sl, in the assignment 3 package.

The purpose of this assignment is to gain hands-on experience in practical use of message passing interface (MPI) in real-world applications, thereby consolidating your understanding of asynchronous message passing and communicators. In addition, you will get familiar with the message-passing scheme used in common spatial-decomposition applications, using the parallel molecular dynamics (MD) program, pmd.c, as an example.

(Part I—Asynchronous Messages)

Modify pmd.c such that, for each message exchange, it first calls MPI_Irecv, then MPI_Send, and finally MPI_Wait. The asynchronous messages make the deadlock-avoidance scheme unnecessary, and thus there is no need to use different orders of send and receive calls for even- and odd-parity processes. In addition to just MPI_Send, insert other computations that do not depend on the received messages between MPI_Irecv and MPI_Wait.
• Submit the modified source code, with your modifications clearly marked.
• Run both the original pmd.c and the modified program on 16 cores (requesting 4 nodes with 4 cores per node in your Slurm script), and compare the execution time for InitUcell = {3,3,3}, StepLimit = 1000, and StepAvg = 1001 in pmd.in (keep all the other parameter values as they are as downloaded from the course home page) and vproc = {2,2,4} (i.e., nproc = 16) in pmd.h. Which program runs faster? Repeat the comparison three times and report the average runtime of both programs. Submit the timing data.

(Part II—Communicators)

Following the lecture note on "In situ analysis of molecular dynamics simulation data using communicators," modify pmd.c such that as many processes as are used for the MD simulation are spawned to calculate the probability density function (PDF) for the atomic velocity.

• Submit the modified source code (name it pmd_split.c), with your modifications clearly marked.
• Run the modified program on 16 cores (requesting 2 nodes with 8 cores per node in your Slurm script), with which 8 cores perform the MD simulation and the other 8 cores calculate the PDF. In pmd.h, choose vproc[3] = {2,2,2} and nproc = 8. Also, specify InitUcell = {5,5,5}, StepLimit = 30, and StepAvg = 10 in pmd.in. Submit the plot of calculated PDFs at time steps 10, 20, and 30.

1. Write a hybrid MPI+OpenMP parallel molecular dynamics (MD) program (name it hmd.c), starting from the MPI parallel MD program, pmd.c, following the lecture note on "hybrid MPI+OpenMP parallel MD". Submit the source code of hmd.c, with your modifications from pmd.c clearly marked.

2. (Verification) Run your hmd.c on two 4-core nodes (a total of 8 cores) with 2 MPI processes, each with 4 OpenMP threads, using the following input parameters: InitUcell = {24,24,12}, Density = 0.8, InitTemp = 1.0, DeltaT = 0.005, StepLimit = 100, StepAvg = 10.
Use the following numbers of MPI processes and OpenMP threads in the header file: vproc = {1,1,2}, nproc = 2, vthrd = {2,2,1}, nthrd = 4. Note the global number of atoms is: 4 atoms/unit cell × (24×24×12 unit cells) × 2 MPI processes = 55,296. Submit the standard output from the run. Make sure that the total energy is the same as that calculated by pmd.c using the same input parameters (shown below), at least to ~5-6 digits.

al = 4.103942e+01 4.103942e+01 2.051971e+01
lc = 16 16 8
rc = 2.564964e+00 2.564964e+00 2.564964e+00
nglob = 55296
0.050000 0.877345 -5.137153 -3.821136
0.100000 0.462056 -4.513097 -3.820013
0.150000 0.510836 -4.587287 -3.821033
0.200000 0.527457 -4.611958 -3.820772
0.250000 0.518668 -4.598798 -3.820796
0.300000 0.529023 -4.614343 -3.820808
0.350000 0.532890 -4.620133 -3.820798
0.400000 0.536070 -4.624899 -3.820794
0.450000 0.539725 -4.630387 -3.820799
0.500000 0.538481 -4.628514 -3.820792
CPU & COMT = 3.836388e+00 2.632065e-02

3. (Scalability) Run your hmd.c on an 8-core node with one MPI process and the number of threads varying from 1, 2, 4, to 8, with input parameters: InitUcell = {24,24,24}, Density = 0.8, InitTemp = 1.0, DeltaT = 0.005, StepLimit = 100, StepAvg = 101. Plot the strong-scaling parallel efficiency as a function of the number of threads and submit the plot.

(Potential Final Project) Optimize the performance of the hybrid MPI+OpenMP MD code. For example, we could enclose the entire MD loop in a parallel clause in the main function to avoid the excessive fork-join overhead. We could also use a lock variable for synchronization.

Part I: Pair-Distribution Computation with CUDA

In this part, you will write a CUDA program (name it pdf.cu) to compute a histogram nhist of atomic pair distances in a molecular dynamics simulation:

for all histogram bins i
    nhist[i] = 0
for all atomic pairs (i,j)
    ++nhist[floor(r_ij / Δr)]

Here, r_ij is the distance between atomic pair (i, j) and Δr is the histogram bin size.
The maximum atomic-pair distance with the periodic boundary condition is the diagonal of half the simulation box,

R_max = sqrt( Σ_{α=x,y,z} (al[α]/2)² ),

and with Nhbin bins for the histogram, the bin size is Δr = R_max/Nhbin. Here, al[α] is the simulation box size in the α-th direction. With the minimal-image convention, however, the maximum distance for which the histogram is meaningful is half the simulation box length, min_{α∈{x,y,z}}(al[α]/2). After computing the pair-distance histogram nhist, the pair distribution function (PDF) at distance r_i = (i+1/2)Δr is defined as

g(r_i) = nhist(i) / (2π r_i² Δr ρ N),

where ρ is the number density of atoms and N is the total number of atoms.

(Assignment)

1. Modify the sequential PDF computation program pdf0.c to a CUDA program, following the lecture note on "Pair distribution computation on GPU". Submit your code.

2. Run your program by reading the atomic configuration pos.d (both pdf0.c and pos.d are available at the class homepage). Plot the resulting pair distribution function, using Nhbin = 2000. Submit your plot.

Part II: MPI+OpenMP+CUDA Computation of π

In this part, you will write a triple-decker MPI+OpenMP+CUDA program (name it pi3.cu) to compute the value of π, by modifying the double-decker MPI+CUDA program, hypi_setdevice.cu, described in the lecture note on "Hybrid MPI+OpenMP+CUDA Programming". Your implementation should utilize two CPU cores and two GPU devices on each compute node. This is achieved by launching one MPI rank per node, where each rank spawns two OpenMP threads that run on different CPU cores and use different GPU devices, as shown in the left figure on the next page. You can employ spatial decomposition in the MPI+OpenMP layer as follows (for the CUDA layer, leave the interleaved assignment of quadrature points to CUDA threads in hypi_setdevice.cu as it is); see the right figure on the next page.
#define NUM_DEVICE 2  // # of GPU devices = # of OpenMP threads
...
// In main()
MPI_Comm_rank(MPI_COMM_WORLD,&myid);  // My MPI rank
MPI_Comm_size(MPI_COMM_WORLD,&nproc); // # of MPI processes
omp_set_num_threads(NUM_DEVICE);      // One OpenMP thread per GPU device
nbin = NBIN/(nproc*NUM_DEVICE);       // # of bins per OpenMP thread
step = 1.0/(float)(nbin*nproc*NUM_DEVICE);
...
#pragma omp parallel private(list the variables that need private copies)
{
  ...
  mpid = omp_get_thread_num();
  offset = (NUM_DEVICE*myid+mpid)*step*nbin; // Quadrature-point offset
  cudaSetDevice(mpid%2);
  ...
}

Make sure to list all variables that need private copies in the private clause of the omp parallel directive. The above OpenMP multithreading will introduce a race condition for the variable pi. This can be circumvented by data privatization, i.e., by defining float pid[NUM_DEVICE] and using the array elements as dedicated accumulators for the OpenMP threads (or GPU devices). To report which of the two GPUs have been used for the run, insert the following lines within the OpenMP parallel block:

cudaGetDevice(&dev_used);
printf("myid = %d; mpid = %d: device used = %d; partial pi = %f\n",
       myid, mpid, dev_used, pid[mpid]);

where int dev_used is the ID of the GPU device (0 or 1) that was used, myid is the MPI rank, mpid is the OpenMP thread ID, and pid[mpid] is the partial sum per OpenMP thread.

(Assignment)

1. Submit your MPI+OpenMP+CUDA code.

2. Run your code on 2 nodes, requesting 2 cores and 2 GPUs per node. Submit your output, which should look like the following:

myid = 0; mpid = 0: device used = 0; partial pi = 0.979926
myid = 0; mpid = 1: device used = 1; partial pi = 0.874671
myid = 1; mpid = 0: device used = 0; partial pi = 0.719409
myid = 1; mpid = 1: device used = 1; partial pi = 0.567582
PI = 3.141588

The purpose of this assignment is to gain hands-on experience in new open standards for programming heterogeneous computers accelerated by graphics processing units (GPUs) and other accelerators.
Specifically, you will practice the directive-based OpenMP target offload and the unified data-parallel programming language, DPC++.

Prerequisite

We will practice both OpenMP target and DPC++ on the Intel developer's cloud (DevCloud). To do so, create your DevCloud account by registering at https://devcloud.intel.com/oneapi.

Part I: OpenMP Target Offload Computation of π

In this part, you will write a GPU offload program (name it omp_teams_pi.c) to compute the value of π using the omp target, teams, and distribute constructs.

(Assignment)

1. Modify the simple OpenMP target program omp_target_pi.c to its teams-distribute counterpart omp_teams_pi.c, following the lecture note on "OpenMP Target Offload for Heterogeneous Architectures". Submit your code.

2. Compile and run your program on a GPU-accelerated computing node on DevCloud. Submit your output, which should look like the following:

u49162@login-2:~$ cc -o omp_teams_pi omp_teams_pi.c -fopenmp
u49162@login-2:~$ qsub -I -l nodes=1:gpu:ppn=2
qsub: waiting for job 714173.v-qsvr-1.aidevcloud to start
qsub: job 714173.v-qsvr-1.aidevcloud ready
u49162@s001-n181:~$ ./omp_teams_pi
PI = 3.141593

Part II: DPC++ Computation of π

In this part, you will experience the compilation and running processes for a DPC++ program (pi.cpp) to compute the value of π. While programming is not required for this part, since C++ is not a prerequisite to this class, please use this opportunity to learn the essence of C++ and DPC++ programming by going through the code and understanding why it works, following the lecture note on "Data Parallel C++ (DPC++) for Heterogeneous Architectures".

(Assignment)

1. Compile and run pi.cpp on a GPU-accelerated computing node on DevCloud.
Submit your output, which should look like the following:

u49162@login-2:~$ dpcpp -o pi pi.cpp
u49162@login-2:~$ qsub -I -l nodes=1:gpu:ppn=2
qsub: waiting for job 714154.v-qsvr-1.aidevcloud to start
qsub: job 714154.v-qsvr-1.aidevcloud ready
u49162@s001-n160:~$ ./pi
Running on: Intel(R) Gen9 HD Graphics NEO
Pi = 3.14159


[SOLVED] Cm146 problem set 1 to 5 solutions

1 Maximum Likelihood Estimation [15 pts]

Suppose we observe the values of n independent random variables X1, . . . , Xn drawn from the same Bernoulli distribution with parameter θ.[1] In other words, for each Xi, we know that P(Xi = 1) = θ and P(Xi = 0) = 1 − θ. Our goal is to estimate the value of θ from these observed values of X1 through Xn.

For any hypothetical value θ̂, we can compute the probability of observing the outcome X1, . . . , Xn if the true parameter value θ were equal to θ̂. This probability of the observed data is often called the likelihood, and the function L(θ) that maps each θ to the corresponding likelihood is called the likelihood function. A natural way to estimate the unknown parameter θ is to choose the θ that maximizes the likelihood function. Formally,

θ̂_MLE = arg max_θ L(θ).

(a) Write a formula for the likelihood function, L(θ) = P(X1, . . . , Xn; θ). Your function should depend on the random variables X1, . . . , Xn and the hypothetical parameter θ. Does the likelihood function depend on the order in which the random variables are observed?

(b) Since the log function is increasing, the θ that maximizes the log likelihood ℓ(θ) = log(L(θ)) is the same as the θ that maximizes the likelihood. Find ℓ(θ) and its first and second derivatives, and use these to find a closed-form formula for the MLE.

(c) Suppose that n = 10 and the data set contains six 1s and four 0s. Write a short program likelihood.py that plots the likelihood function of this data for each value of θ̂ in {0, 0.01, 0.02, . . . , 1.0} (use np.linspace(...) to generate this spacing). For the plot, the x-axis should be θ and the y-axis L(θ). Scale your y-axis so that you can see some variation in its value. Include the plot in your writeup (there is no need to submit your code). Estimate θ̂_MLE by marking on the x-axis the value of θ̂ that maximizes the likelihood. Does the answer agree with the closed-form answer?
(d) Create three more likelihood plots: one where n = 5 and the data set contains three 1s and two 0s; one where n = 100 and the data set contains sixty 1s and forty 0s; and one where n = 10 and there are five 1s and five 0s. Include these plots in your writeup, and describe how the likelihood functions and maximum likelihood estimates compare for the different data sets.

¹This is a common assumption for sampling data. So we will denote this assumption as iid, short for Independent and Identically Distributed, meaning that each random variable has the same distribution and is drawn independent of all the other random variables.

2 Splitting Heuristic for Decision Trees [14 pts]

Recall that the ID3 algorithm iteratively grows a decision tree from the root downwards. On each iteration, the algorithm replaces one leaf node with an internal node that splits the data based on one decision attribute (or feature). In particular, the ID3 algorithm chooses the split that reduces the entropy the most, but there are other choices. For example, since our goal in the end is to have the lowest error, why not instead choose the split that reduces error the most? In this problem, we will explore one reason why reducing entropy is a better criterion.

Consider the following simple setting. Let us suppose each example is described by n boolean features: X = ⟨X1, . . . , Xn⟩, where Xi ∈ {0, 1}, and where n ≥ 4. Furthermore, the target function to be learned is f : X → Y, where Y = X1 ∨ X2 ∨ X3. That is, Y = 1 if X1 = 1 or X2 = 1 or X3 = 1, and Y = 0 otherwise. Suppose that your training data contains all of the 2^n possible examples, each labeled by f. For example, when n = 4, the data set would be

X1 X2 X3 X4 | Y
0  0  0  0  | 0
1  0  0  0  | 1
0  1  0  0  | 1
1  1  0  0  | 1
0  0  1  0  | 1
1  0  1  0  | 1
0  1  1  0  | 1
1  1  1  0  | 1
0  0  0  1  | 0
1  0  0  1  | 1
0  1  0  1  | 1
1  1  0  1  | 1
0  0  1  1  | 1
1  0  1  1  | 1
0  1  1  1  | 1
1  1  1  1  | 1

(a) How many mistakes does the best 1-leaf decision tree make over the 2^n training examples?
(The 1-leaf decision tree does not split the data even once. Make sure you answer for the general case when n ≥ 4.)

(b) Is there a split that reduces the number of mistakes by at least one? (That is, is there a decision tree with 1 internal node with fewer mistakes than your answer to part (a)?) Why or why not?

(c) What is the entropy of the output label Y for the 1-leaf decision tree (no splits at all)?

(d) Is there a split that reduces the entropy of the output Y by a non-zero amount? If so, what is it, and what is the resulting conditional entropy of Y given this split?

3 Entropy and Information [2 pts]

The entropy of a Bernoulli (Boolean 0/1) random variable X with p(X = 1) = q is given by B(q) = −q log q − (1 − q) log(1 − q). Suppose that a set S of examples contains p positive examples and n negative examples. The entropy of S is defined as H(S) = B(p/(p + n)).

(a) Based on an attribute Xj, we split our examples into k disjoint subsets Sk, with pk positive and nk negative examples in each. If the ratio pk/(pk + nk) is the same for all k, show that the information gain of this attribute is 0.

4 Programming exercise: Applying decision trees [24 pts]

Submission instructions
• Only provide answers and plots. Do not submit code.

Introduction²

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships. One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class.
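Returning briefly to the splitting-heuristic problem (Section 2 above): parts (a) and (c) can be sanity-checked by brute force for the n = 4 case. A sketch, assuming nothing beyond the problem statement (variable names are illustrative):

```python
from itertools import product
from math import log2

# Enumerate all 2^n examples for n = 4, labeled with Y = X1 v X2 v X3.
n = 4
examples = [(x, int(x[0] or x[1] or x[2])) for x in product([0, 1], repeat=n)]

# The best 1-leaf tree predicts the majority label; count its mistakes.
ones = sum(y for _, y in examples)
zeros = len(examples) - ones
mistakes = min(ones, zeros)  # the 2^(n-3) examples with Y = 0

# Entropy of Y at the root: B(q) = -q log q - (1 - q) log(1 - q), q = P(Y = 1).
q = ones / len(examples)
entropy = -q * log2(q) - (1 - q) * log2(1 - q)
print(mistakes, round(entropy, 4))  # → 2 0.5436
```

This matches the general-case counting the problem asks for (2^(n−3) mistakes at the root for n = 4 gives 2), and gives a concrete value of B(q) to compare against part (d).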
In this problem, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy.

Starter Files

code and data
• code: titanic.py
• data: titanic_train.csv

documentation
• Decision Tree Classifier: https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
• Cross-Validation: https://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.train_test_split.html
• Metrics: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html

Download the code and data sets from the course website. For more information on the data set, see the Kaggle description: https://www.kaggle.com/c/titanic/data. (The provided data sets are modified versions of the data available from Kaggle.³)

²This assignment is adapted from the Kaggle Titanic competition, available at https://www.kaggle.com/c/titanic. Some parts of the problem are copied verbatim from Kaggle.
³Passengers with missing values for any feature have been removed. Also, the categorical feature Sex has been mapped to {'female': 0, 'male': 1} and Embarked to {'C': 0, 'Q': 1, 'S': 2}. If you are interested more in this process of data munging, Kaggle has an excellent tutorial available at https://www.kaggle.com/c/titanic/details/getting-started-with-python-ii.

Note that any portions of the code that you must modify have been indicated with TODO. Do not change any code outside of these blocks.

4.1 Visualization [4 pts]

One of the first things to do before trying any formal machine learning technique is to dive into the data. This can include looking for funny values in the data, looking for outliers, looking at the range of feature values, what features seem important, etc.

(a) Run the code (titanic.py) to make histograms for each feature, separating the examples by class (e.g. survival).
This should produce seven plots, one for each feature, and each plot should have two overlapping histograms, with the color of the histogram indicating the class. For each feature, what trends do you observe in the data?

4.2 Evaluation [20 pts]

Now, let us use scikit-learn to train a DecisionTreeClassifier on the data. Using the predictive capabilities of the scikit-learn package is very simple. In fact, it can be carried out in three simple steps: initializing the model, fitting it to the training data, and predicting new values.⁴

(b) Before trying out any classifier, it is often useful to establish a baseline. We have implemented one simple baseline classifier, MajorityVoteClassifier, that always predicts the majority class from the training set. Read through the MajorityVoteClassifier and its usage and make sure you understand how it works.

Your goal is to implement and evaluate another baseline classifier, RandomClassifier, that predicts a target class according to the distribution of classes in the training data set. For example, if 60% of the examples in the training set have Survived = 0 and 40% have Survived = 1, then, when applied to a test set, RandomClassifier should randomly predict 60% of the examples as Survived = 0 and 40% as Survived = 1. Implement the missing portions of RandomClassifier according to the provided specifications. Then train your RandomClassifier on the entire training data set, and evaluate its training error. If you implemented everything correctly, you should have an error of 0.485.

(c) Now that we have a baseline, train and evaluate a DecisionTreeClassifier (using the class from scikit-learn and referring to the documentation as needed). Make sure you initialize your classifier with the appropriate parameters; in particular, use the 'entropy' criterion discussed in class. What is the training error of this classifier?
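The idea behind RandomClassifier in part (b) can be sketched as follows. This is an illustrative stand-in using only the standard library, not the assignment's actual skeleton class or interface:

```python
import random
from collections import Counter

class RandomClassifierSketch:
    """Predicts classes at random according to the training-class distribution.

    Illustrative stand-in; the real RandomClassifier must follow the
    specifications in the titanic.py skeleton.
    """
    def fit(self, y):
        counts = Counter(y)
        self.classes_ = sorted(counts)
        self.probs_ = [counts[c] / len(y) for c in self.classes_]
        return self

    def predict(self, n, seed=0):
        # Sample n labels with replacement, weighted by the class frequencies.
        rng = random.Random(seed)
        return rng.choices(self.classes_, weights=self.probs_, k=n)

# 60% of training labels are 0, 40% are 1, as in the example above.
clf = RandomClassifierSketch().fit([0] * 60 + [1] * 40)
preds = clf.predict(1000, seed=42)
```

On a large test set the predicted label frequencies are close to the 60/40 training proportions, which is exactly why its expected training error on the Titanic data lands near the quoted 0.485 rather than at the majority-vote error.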
(d) So far, we have looked only at training error, but as we learned in class, training error is a poor metric for evaluating classifiers. Let us use cross-validation instead.

⁴Note that almost all of the model techniques in scikit-learn share a few common named functions, once they are initialized. You can always find out more about them in the documentation for each model. These are some-model-name.fit(…), some-model-name.predict(…), and some-model-name.score(…).

Implement the missing portions of error(…) according to the provided specifications. You may find it helpful to use train_test_split(…) from scikit-learn. To ensure that we always get the same splits across different runs (and thus can compare the classifier results), set the random_state parameter to be the trial number.

Next, use your error(…) function to evaluate the training error and (cross-validation) test error of each of your three models. To do this, generate a random 80/20 split of the training data, train each model on the 80% fraction, evaluate the error on either the 80% or the 20% fraction, and repeat this 100 times to get an average result. What are the average training and test error of each of your classifiers on the Titanic data set?

(e) One problem with decision trees is that they can overfit to training data, yielding complex classifiers that do not generalize well to new data. Let us see whether this is the case for the Titanic data. One way to prevent decision trees from overfitting is to limit their depth. Repeat your cross-validation experiments but for increasing depth limits, specifically, 1, 2, . . . , 20. Then plot the average training error and test error against the depth limit. (Also plot the average test error for your baseline classifiers. As the baseline classifiers are independent of the depth limit, their plots should be flat lines.) Include this plot in your writeup, making sure to label all axes and include a legend for your classifiers.
What is the best depth limit to use for this data? Do you see overfitting? Justify your answers using the plot.

(f) Another useful tool for evaluating classifiers is learning curves, which show how classifier performance (e.g. error) relates to experience (e.g. amount of training data). Run another experiment using a decision tree with the best depth limit you found above. This time, vary the amount of training data by starting with splits of 0.05 (5% of the data used for training) and working up to splits of size 0.95 (95% of the data used for training) in increments of 0.05. Then plot the decision tree training and test error against the amount of training data. (Also plot the average test error for your baseline classifiers.) Include this plot in your writeup, and provide a 1-2 sentence description of your observations.

1 Perceptron [2 pts]

Design (specify θ for) a two-input perceptron (with an additional bias or offset term) that computes the following boolean functions. Assume T = 1 and F = −1. If a valid perceptron exists, show that it is not unique by designing another valid perceptron (with a different hyperplane, not simply through normalization). If no perceptron exists, state why.

(a) OR
(b) XOR

2 Logistic Regression [10 pts]

Consider the objective function that we minimize in logistic regression:

J(θ) = − Σ_{n=1}^{N} [yn log hθ(xn) + (1 − yn) log(1 − hθ(xn))]

(a) Find the partial derivatives ∂J/∂θj.

(b) Find the partial second derivatives ∂²J/∂θj∂θk and show that the Hessian (the matrix H of second derivatives with elements Hjk = ∂²J/∂θj∂θk) can be written as H = Σ_{n=1}^{N} hθ(xn)(1 − hθ(xn)) xn xnᵀ.

(c) Show that J is a convex function and therefore has no local minima other than the global one. Hint: A function J is convex if its Hessian is positive semi-definite (PSD), written H ⪰ 0. A matrix is PSD if and only if zᵀHz = Σ_{j,k} zj zk Hjk ≥ 0 for all real vectors z.
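For part (a) of the perceptron problem, one valid (and, as the problem notes, not unique) choice is θ = (θ0, θ1, θ2) = (1, 1, 1). A quick check under the stated convention T = 1, F = −1; the helper names are illustrative:

```python
# Perceptron with bias: h(x) = sign(theta0 + theta1*x1 + theta2*x2).
def perceptron(theta, x1, x2):
    s = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if s > 0 else -1

def boolean_or(x1, x2):
    return 1 if (x1 == 1 or x2 == 1) else -1

# theta = (1, 1, 1) computes OR on all four inputs (T = 1, F = -1):
# only (-1, -1) gives 1 - 1 - 1 = -1 < 0, i.e. output F.
theta = (1, 1, 1)
inputs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
assert all(perceptron(theta, a, b) == boolean_or(a, b) for a, b in inputs)
```

A second valid choice such as (1.5, 1, 1) defines a genuinely different hyperplane, which is one way to show non-uniqueness; no θ works for XOR, since its positive and negative examples are not linearly separable.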
3 Locally Weighted Linear Regression [14 pts]

Consider a linear regression problem in which we want to "weight" different training instances differently because some of the instances are more important than others. Specifically, suppose we want to minimize

J(θ0, θ1) = Σ_{n=1}^{N} wn (θ0 + θ1 xn,1 − yn)².

Here wn > 0. In class, we worked out what happens for the case where all the weights (the wn's) are the same. In this problem, we will generalize some of those ideas to the weighted setting.

(a) Calculate the gradient by computing the partial derivatives of J with respect to each of the parameters (θ0, θ1).

(b) Set each partial derivative to 0 and solve for θ0 and θ1 to obtain values of (θ0, θ1) that minimize J.

(c) Show that J(θ) can also be written J(θ) = (Xθ − y)ᵀ W (Xθ − y) for an appropriate diagonal matrix W, and where the n-th row of X is (1, xn,1), y = (y1, y2, . . . , yN)ᵀ, and θ = (θ0, θ1)ᵀ. State clearly what W is.

4 Implementation: Polynomial Regression [20 pts]

In this exercise, you will work through linear and polynomial regression. Our data consists of inputs xn ∈ R and outputs yn ∈ R, n ∈ {1, . . . , N}, which are related through a target function y = f(x). Your goal is to learn a linear predictor hθ(x) that best approximates f(x). But this time, rather than using scikit-learn, we will further open the "black-box", and you will implement the regression model!

code and data
• code: regression.py
• data: regression_train.csv, regression_test.csv

This is likely the first time that many of you are working with numpy and matrix operations within a programming environment. For the uninitiated, you may find it useful to work through a numpy tutorial first.¹ Here are some things to keep in mind as you complete this problem:

• If you are seeing many errors at runtime, inspect your matrix operations to make sure that you are adding and multiplying matrices of compatible dimensions.
Printing the dimensions of variables with the X.shape command will help you debug.

• When working with numpy arrays, remember that numpy interprets the * operator as elementwise multiplication. This is a common source of size incompatibility errors. If you want matrix multiplication, you need to use the dot function in Python. For example, A*B does element-wise multiplication while dot(A,B) does a matrix multiply.

• Be careful when handling numpy vectors (rank-1 arrays): the vector shapes 1 × N, N × 1, and N are all different things. For these dimensions, we follow the conventions of scikit-learn's LinearRegression class². Most importantly, unless otherwise indicated (in the code documentation), both column and row vectors are rank-1 arrays of shape N, not rank-2 arrays of shape N × 1 or shape 1 × N.

Visualization [1 pts]

As we learned last week, it is often useful to understand the data through visualizations. For this data set, you can use a scatter plot to visualize the data since it has only two properties to plot (x and y).

(a) Visualize the training and test data using the plot_data(…) function. What do you observe? For example, can you make an educated guess on the effectiveness of linear regression in predicting the data?

¹Try out SciPy's tutorial (https://wiki.scipy.org/Tentative_NumPy_Tutorial), or use your favorite search engine to find an alternative. Those familiar with Matlab may find the "Numpy for Matlab Users" documentation (https://wiki.scipy.org/NumPy_for_Matlab_Users) more helpful.
²https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

Linear Regression [12 pts]

Recall that linear regression attempts to minimize the objective function

J(θ) = Σ_{n=1}^{N} (hθ(xn) − yn)².

In this problem, we will use the matrix-vector form, where y = (y1, y2, . . . , yN)ᵀ, the n-th row of X is xnᵀ, θ = (θ0, θ1, θ2, . . . , θD)ᵀ, and each instance xn = (1, xn,1, . . . , xn,D)ᵀ.
In this instance, the number of input features D = 1. Rather than working with this fully generalized, multivariate case, let us start by considering a simple linear regression model:

hθ(x) = θᵀx = θ0 + θ1 x1

regression.py contains the skeleton code for the class PolynomialRegression. Objects of this class can be instantiated as model = PolynomialRegression(m), where m is the degree of the polynomial feature vector: the feature vector for instance n is (1, xn,1, xn,1², . . . , xn,1^m)ᵀ. Setting m = 1 instantiates an object where the feature vector for instance n is (1, xn,1)ᵀ.

(b) Note that to take into account the intercept term (θ0), we can add an additional "feature" to each instance and set it to one, e.g. xi,0 = 1. This is equivalent to adding an additional first column to X and setting it to all ones. Modify PolynomialRegression.generate_polynomial_features(…) to create the matrix X for a simple linear model.

(c) Before tackling the harder problem of training the regression model, complete PolynomialRegression.predict(…) to predict y from X and θ.

(d) One way to solve linear regression is through gradient descent (GD). Recall that the parameters of our model are the θj values. These are the values we will adjust to minimize J(θ). In gradient descent, each iteration performs the update

θj ← θj − 2α Σ_{n=1}^{N} (hθ(xn) − yn) xn,j

(simultaneously update θj for all j). With each step of gradient descent, we expect our updated parameters θj to come closer to the parameters that will achieve the lowest value of J(θ).

• As we perform gradient descent, it is helpful to monitor the convergence by computing the cost, i.e., the value of the objective function J. Complete PolynomialRegression.cost(…) to calculate J(θ). If you have implemented everything correctly, then the following code snippet should return 40.234.
train_data = load_data('regression_train.csv')
model = PolynomialRegression()
model.coef_ = np.zeros(2)
model.cost(train_data.X, train_data.y)

• Next, implement the gradient descent step in PolynomialRegression.fit_GD(…). The loop structure has been written for you, and you only need to supply the updates to θ and the new predictions ŷ = hθ(x) within each iteration. We will use the following specifications for the gradient descent algorithm:
– We run the algorithm for 10,000 iterations.
– We terminate the algorithm earlier if the value of the objective function is unchanged across consecutive iterations.
– We will use a fixed step size.

• So far, you have used a default learning rate (or step size) of η = 0.01. Try different values η = 10⁻⁴, 10⁻³, 10⁻², 0.0407, and make a table of the coefficients, the number of iterations until convergence (this number will be 10,000 if the algorithm did not converge in a smaller number of iterations), and the final value of the objective function. How do the coefficients compare? How quickly does each algorithm converge?

(e) In class, we learned that the closed-form solution to linear regression is

θ = (XᵀX)⁻¹ Xᵀ y.

Using this formula, you will get an exact solution in one calculation: there is no "loop until convergence" like in gradient descent.

• Implement the closed-form solution PolynomialRegression.fit(…).

• What is the closed-form solution? How do the coefficients and the cost compare to those obtained by GD? How quickly does the algorithm run compared to GD?

(f) Finally, set a learning rate η for GD that is a function of k (the number of iterations) (use ηk = 1/(1 + k)) and converges to the same solution yielded by the closed-form optimization (minus possible rounding errors). Update PolynomialRegression.fit_GD(…) with your proposed learning rate. How long does it take the algorithm to converge with your proposed learning rate?
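Parts (d) and (e) can be illustrated side by side on a toy data set. The data, step size, and variable names below are illustrative (not the course data), and the closed-form normal equations θ = (XᵀX)⁻¹Xᵀy are written out by hand for the two-parameter model rather than with numpy:

```python
# Toy data generated from y = 1 + 2x (illustrative, not regression_train.csv).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1 + 2 * x for x in xs]

def predict(theta, x):
    return theta[0] + theta[1] * x  # h_theta(x), with x_{n,0} = 1

# Gradient descent: theta_j <- theta_j - 2*alpha * sum_n (h(x_n) - y_n) x_{n,j}
theta, alpha = [0.0, 0.0], 0.05
for _ in range(10000):
    errs = [predict(theta, x) - y for x, y in zip(xs, ys)]
    theta = [theta[0] - 2 * alpha * sum(errs),
             theta[1] - 2 * alpha * sum(e * x for e, x in zip(errs, xs))]

# Closed form theta = (X^T X)^{-1} X^T y, expanded via the 2x2 inverse.
n = len(xs)
sx, sxx = sum(xs), sum(x * x for x in xs)
sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
det = n * sxx - sx * sx
closed = [(sxx * sy - sx * sxy) / det, (n * sxy - sx * sy) / det]
# Both recover theta ≈ (1, 2): GD iterates toward the same minimizer
# that the closed form produces in a single calculation.
```

With this step size GD converges to the closed-form coefficients; with a step size that is too large the iterates diverge, which is the behavior the η-comparison table in part (d) is meant to expose.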
Polynomial Regression [7 pts]

Now let us consider the more complicated case of polynomial regression, where our hypothesis is

hθ(x) = θᵀφ(x) = θ0 + θ1 x + θ2 x² + . . . + θm x^m.

(g) Recall that polynomial regression can be considered as an extension of linear regression in which we replace our input matrix X with Φ, whose n-th row is φ(xn)ᵀ, where φ(x) is a function such that φj(x) = x^j for j = 0, . . . , m. Update PolynomialRegression.generate_polynomial_features(…) to create an m + 1 dimensional feature vector for each instance.

(h) Given N training instances, it is always possible to obtain a "perfect fit" (a fit in which all the data points are exactly predicted) by setting the degree of the regression to N − 1. Of course, we would expect such a fit to generalize poorly. In the remainder of this problem, you will investigate the problem of overfitting as a function of the degree of the polynomial, m. To measure overfitting, we will use the Root-Mean-Square (RMS) error, defined as

E_RMS = √(J(θ)/N),

where N is the number of instances.³ Why do you think we might prefer RMSE as a metric over J(θ)? Implement PolynomialRegression.rms_error(…).

(i) For m = 0, . . . , 10, use the closed-form solver to determine the best-fit polynomial regression model on the training data, and with this model, calculate the RMSE on both the training data and the test data. Generate a plot depicting how RMSE varies with model complexity (polynomial degree) – you should generate a single plot with both training and test error, and include this plot in your writeup. Which degree polynomial would you say best fits the data? Was there evidence of under/overfitting the data? Use your plot to justify your answer.

³Note that the RMSE as defined is a biased estimator.
To obtain an unbiased estimator, we would have to divide by n − k, where k is the number of parameters fitted (including the constant), so here, k = m + 1.

1 Kernels [8 pts]

(a) For any two documents x and z, define k(x, z) to equal the number of unique words that occur in both x and z (i.e., the size of the intersection of the sets of words in the two documents). Is this function a kernel? Give justification for your answer.

(b) One way to construct kernels is to build them from simpler ones. We have seen various "construction rules", including the following. Assuming k1(x, z) and k2(x, z) are kernels, then so are

• (scaling) f(x) k1(x, z) f(z) for any function f(x) ∈ R
• (sum) k(x, z) = k1(x, z) + k2(x, z)
• (product) k(x, z) = k1(x, z) k2(x, z)

Using the above rules and the fact that k(x, z) = x · z is (clearly) a kernel, show that the following is also a kernel:

(1 + (x/||x||) · (z/||z||))³

(c) Given vectors x and z in R², define the kernel kβ(x, z) = (1 + βx · z)³ for any value β > 0. Find the corresponding feature map φβ(·).¹ What are the similarities/differences from the kernel k(x, z) = (1 + x · z)³, and what role does the parameter β play?

2 SVM [8 pts]

Suppose we are looking for a maximum-margin linear classifier through the origin, i.e. b = 0 (also hard margin, i.e., no slack variables). In other words, we minimize (1/2)||θ||² subject to yn θᵀxn ≥ 1, n = 1, . . . , N.

Parts of this assignment are adapted from course material by Tommi Jaakkola (MIT), Andrew Ng (Stanford), and Jenna Wiens (UMich).
¹You may use any external program to expand the cubic.

(a) Given a single training vector x = (a, e)ᵀ with label y = −1, what is the θ* that satisfies the above constrained minimization?

(b) Suppose we have two training examples, x1 = (1, 1)ᵀ and x2 = (1, 0)ᵀ with labels y1 = 1 and y2 = −1. What is θ* in this case, and what is the margin γ?

(c) Suppose we now allow the offset parameter b to be non-zero.
How would the classifier and the margin change in the previous question? What are (θ*, b*) and γ? Compare your solutions with and without offset.

3 Twitter analysis using SVMs [26 pts]

In this project, you will be working with Twitter data. Specifically, we have supplied you with a number of tweets that are reviews/reactions to movies², e.g., "@nickjfrost just saw The Boat That Rocked/Pirate Radio and I thought it was brilliant! You and the rest of the cast were fantastic! < 3". You will learn to automatically classify such tweets as either positive or negative reviews. To do this, you will employ Support Vector Machines (SVMs), a popular choice for a large number of classification problems.

Download the code and data sets from the course website. It contains the following data files:

• tweets.txt contains 630 tweets about movies. Each line in the file contains exactly one tweet, so there are 630 lines in total.
• labels.txt contains the corresponding labels. If a tweet praises or recommends a movie, it is classified as a positive review and labeled +1; otherwise it is classified as a negative review and labeled −1. These labels are ordered, i.e. the label for the i-th tweet in tweets.txt corresponds to the i-th number in labels.txt.
• held_out_tweets.txt contains 70 tweets for which we have withheld the labels.

Skim through the tweets to get a sense of the data. The python file twitter.py contains skeleton code for the project. Skim through the code to understand its structure.

3.1 Feature Extraction [2 pts]

We will use a bag-of-words model to convert each tweet into a feature vector. A bag-of-words model treats a text file as a collection of words, disregarding word order. The first step in building a bag-of-words model involves building a "dictionary". A dictionary contains all of the unique words in the text file. For this project, we will be including punctuations in the dictionary too. For example, a text file containing "John likes movies.
Mary likes movies2!!" will have a dictionary {'John': 0, 'Mary': 1, 'likes': 2, 'movies': 3, 'movies2': 4, '.': 5, '!': 6}. Note that the (key, value) pairs are (word, index), where the index keeps track of the number of unique words (size of the dictionary).

²Please note that these data were selected at random and thus the content of these tweets do not reflect the views of the course staff. 🙂

Given a dictionary containing d unique words, we can transform the n variable-length tweets into n feature vectors of length d by setting the i-th element of the j-th feature vector to 1 if the i-th dictionary word is in the j-th tweet, and 0 otherwise.

(a) We have implemented extract_words(…) that processes an input string to return a list of unique words. This method takes a simplistic approach to the problem, treating any string of characters (that does not include a space) as a "word" and also extracting and including all unique punctuations. Implement extract_dictionary(…) that uses extract_words(…) to read all unique words contained in a file into a dictionary (as in the example above). Process the tweets in the order they appear in the file to create this dictionary of d unique words/punctuations.

(b) Next, implement extract_feature_vectors(…) that produces the bag-of-words representation of a file based on the extracted dictionary. That is, for each tweet i, construct a feature vector of length d, where the j-th entry in the feature vector is 1 if the j-th word in the dictionary is present in tweet i, or 0 otherwise. For n tweets, save the feature vectors in a feature matrix, where the rows correspond to tweets (examples) and the columns correspond to words (features). Maintain the order of the tweets as they appear in the file.

(c) In main(…), we have provided code to read the tweets and labels into a (630, d) feature matrix and (630,) label array. Split the feature matrix and corresponding labels into your training and test sets.
The first 560 tweets will be used for training and the last 70 tweets will be used for testing. **All subsequent operations will be performed on these data.**

3.2 Hyperparameter Selection for a Linear-Kernel SVM [10 pts]

Next, we will learn a classifier to separate the training data into positive and negative tweets. For the classifier, we will use SVMs with two different kernels: linear and radial basis function (RBF). We will use the sklearn.svm.SVC class and explicitly set only three of the initialization parameters: kernel, gamma, and C. As usual, we will use SVC.fit(X,y) to train our SVM, but in lieu of using SVC.predict(X) to make predictions, we will use SVC.decision_function(X), which returns the (signed) distance of the samples to the separating hyperplane.

SVMs have hyperparameters that must be set by the user. For both linear and RBF-kernel SVMs, we will select the hyperparameters using 5-fold cross-validation (CV). Using 5-fold CV, we will select the hyperparameters that lead to the 'best' mean performance across all 5 folds.

(a) The result of a hyperparameter selection often depends upon the choice of performance measure. Here, we will consider the following performance measures: accuracy, F1-score, AUROC, precision, sensitivity, and specificity. Implement performance(…). All measures, except sensitivity and specificity, are implemented in the sklearn.metrics library. You can use sklearn.metrics.confusion_matrix(…) to calculate the other two.

(b) Next, implement cv_performance(…) to return the mean k-fold CV performance for the performance metric passed into the function. Here, you will make use of SVC.fit(X,y) and SVC.decision_function(X), as well as your performance(…) function.

You may have noticed that the proportions of the two classes (positive and negative) are not equal in the training data. When dividing the data into folds for CV, you should try to keep the class proportions roughly the same across folds.
In your write-up, briefly describe why it might be beneficial to maintain class proportions across folds. Then, in main(…), use sklearn.cross_validation.StratifiedKFold(…) to split the data for 5-fold CV, making sure to stratify using only the training labels.

(c) Now, implement select_param_linear(…) to choose a setting for C for a linear SVM based on the training data and the specified metric. Your function should call cv_performance(…), passing in instances of SVC(kernel='linear', C=c) with different values for C, e.g., C = 10⁻³, 10⁻², . . . , 10².

(d) Finally, using the training data from Section 3.1 and the functions implemented here, find the best setting for C for each performance measure mentioned above. Report your findings in tabular format (up to the fourth decimal place):

C      | accuracy | F1-score | AUROC | precision | sensitivity | specificity
10⁻³   |          |          |       |           |             |
10⁻²   |          |          |       |           |             |
10⁻¹   |          |          |       |           |             |
10⁰    |          |          |       |           |             |
10¹    |          |          |       |           |             |
10²    |          |          |       |           |             |
best C |          |          |       |           |             |

Your select_param_linear(…) function returns the 'best' C given a range of values. How does the 5-fold CV performance vary with C and the performance metric?

3.3 Hyperparameter Selection for an RBF-kernel SVM [8 pts]

Similar to the hyperparameter selection for a linear-kernel SVM, you will perform hyperparameter selection for an RBF-kernel SVM.

(a) Describe the role of the additional hyperparameter γ for an RBF-kernel SVM. How does γ affect generalization error?

(b) Implement select_param_rbf(…) to choose a setting for C and γ via a grid search. Your function should call cv_performance(…), passing in instances of SVC(kernel='rbf', C=c, gamma=gamma) with different values for C and gamma. Explain what kind of grid you used and why.

(c) Finally, using the training data from Section 3.1 and the function implemented here, find the best setting for C and γ for each performance measure mentioned above. Report your findings in tabular format. This time, because we have a two-dimensional grid search, report only the best score for each metric, along with the accompanying C and γ setting.
metric      | score | C | γ
accuracy    |       |   |
F1-score    |       |   |
AUROC       |       |   |
precision   |       |   |
sensitivity |       |   |
specificity |       |   |

How does the CV performance vary with the hyperparameters of the RBF-kernel SVM?

3.4 Test Set Performance [6 pts]

In this section, you will apply the two classifiers learned in the previous sections to the test data from Section 3.1. Once you have predicted labels for the test data, you will measure performance.

(a) Based on the results you obtained in Section 3.2 and Section 3.3, choose a hyperparameter setting for the linear-kernel SVM and a hyperparameter setting for the RBF-kernel SVM. Explain your choice. Then, in main(…), using the training data extracted in Section 3.1 and SVC.fit(…), train a linear- and an RBF-kernel SVM with your chosen settings.

(b) Implement performance_test(…) which returns the value of a performance measure, given the test data and a trained classifier.

(c) For each performance metric, use performance_test(…) and the two trained linear- and RBF-kernel SVM classifiers to measure performance on the test data. Report the results. Be sure to include the name of the performance metric employed, and the performance on the test data. How does the test performance of your two classifiers compare?

Introduction

Machine learning techniques have been applied to a variety of image interpretation problems. In this project, you will investigate facial recognition, which can be treated as a clustering problem ("separate these pictures of Joe and Mary"). For this project, we will use a small part of a huge database of faces of famous people (the Labeled Faces in the Wild [LFW] people dataset¹). The images have already been cropped out of the original image, and scaled and rotated so that the eyes and mouth are roughly in alignment; additionally, we will use a version that is scaled down to a manageable size of 50 by 37 pixels (for a total of 1850 "raw" features). Our dataset has a total of 1867 images of 19 different people.
You will apply dimensionality reduction using principal component analysis (PCA) and explore clustering methods such as k-means and k-medoids to the problem of facial recognition on this dataset.

Download the starter files from the course website. It contains the following source files:

• util.py – Utility methods for manipulating data, including through PCA.
• cluster.py – Code for the Point, Cluster, and ClusterSet classes, on which you will build the clustering algorithms.
• faces.py – Main code for the project.

Please note that you do not necessarily have to follow the skeleton code perfectly. We encourage you to include your own additional methods and functions. However, you are not allowed to use any scikit-learn classes or functions other than those already imported in the skeleton code.

1 PCA and Image Reconstruction [4 pts]

Before attempting automated facial recognition, you will investigate a general problem with images. That is, images are typically represented as thousands (in this project) to millions (more generally) of pixel values, and a high-dimensional vector of pixels must be reduced to a reasonably low-dimensional vector of features.

(a) As always, the first thing to do with any new dataset is to look at it. Use get_lfw_data(…) to get the LFW dataset with labels, and plot a couple of the input images using show_image(…). Then compute the mean of all the images, and plot it. (Remember to include all requested images in your writeup.) Comment briefly on this "average" face.

(b) Perform PCA on the data using util.PCA(…). This function returns a matrix U whose columns are the principal components, and a vector mu which is the mean of the data. If you want to look at a principal component (referred to in this setting as an eigenface), run show_image(vec_to_image(v)), where v is a column of the principal component matrix. (This function will scale vector v appropriately for image display.)
Show the top twelve eigenfaces:

plot_gallery([vec_to_image(U[:,i]) for i in xrange(12)])

Comment briefly on your observations. Why do you think these are selected as the top eigenfaces?

[1] https://vis-www.cs.umass.edu/lfw/

(c) Explore the effect of using more or fewer dimensions to represent images. Do this by:

• Finding the principal components of the data
• Selecting a number l of components to use
• Reconstructing the images using only the first l principal components
• Visually comparing the images to the originals

To perform PCA, use apply_PCA_from_Eig(…) to project the original data into the lower-dimensional space, and then use reconstruct_from_PCA(…) to reconstruct high-dimensional images out of lower-dimensional ones. Then, using plot_gallery(…), submit a gallery of the first 12 images in the dataset, reconstructed with l components, for l = 1, 10, 50, 100, 500, 1288. Comment briefly on the effectiveness of differing values of l with respect to facial recognition. We will revisit PCA in the last section of this project.

2 K-Means and K-Medoids [16 pts]

Next, we will explore clustering algorithms in detail by applying them to a toy dataset. In particular, we will investigate k-means and k-medoids (a slight variation on k-means).

(a) In k-means, we attempt to find k cluster centers µ_j ∈ R^d, j ∈ {1, ..., k}, and n cluster assignments c^(i) ∈ {1, ..., k}, i ∈ {1, ..., n}, such that the total distance between each data point and the nearest cluster center is minimized. In other words, we attempt to find µ_1, ..., µ_k and c^(1), ..., c^(n) that minimize

J(c, µ) = Σ_{i=1}^{n} ||x^(i) − µ_{c^(i)}||²

To do so, we iterate between assigning x^(i) to the nearest cluster center c^(i) and updating each cluster center µ_j to the average of all points assigned to the jth cluster. Instead of holding the number of clusters k fixed, one can think of minimizing the objective function over µ, c, and k. Show that this is a bad idea.
Specifically, what is the minimum possible value of J(c, µ, k)? What values of c, µ, and k result in this value?

(b) To implement our clustering algorithms, we will use Python classes to help us define three abstract data types: Point, Cluster, and ClusterSet (available in cluster.py). Read through the documentation for these classes. (You will be using these classes later, so make sure you know what functionality each class provides!) Some of the class methods are already implemented, and other methods are described in comments. Implement all of the methods marked TODO in the Cluster and ClusterSet classes.

(c) Next, implement random_init(…) and kMeans(…) based on the provided specifications.

(d) Now test the performance of k-means on a toy dataset. Use generate_points_2d(…) to generate three clusters each containing 20 points. (You can modify generate_points_2d(…) to test different inputs while debugging your code, but be sure to return to the initial implementation before creating any plots for submission.) You can plot the clusters for each iteration using the plot_clusters(…) function. In your writeup, include plots for the k-means cluster assignments and corresponding cluster "centers" for each iteration when using random initialization.

(e) Implement kMedoids(…) based on the provided specification. Hint: Since k-means and k-medoids are so similar, you may find it useful to refactor your code to use a helper function kAverages(points, k, average, init='random', plot=True), where average is a method that determines how to calculate the average of points in a cluster (so it can take on values ClusterSet.centroids or ClusterSet.medoids). [2] As before, include plots for k-medoids clustering for each iteration when using random initialization.

(f) Finally, we will explore the effect of initialization. Implement cheat_init(…). Now compare clustering by initializing using cheat_init(…). Include plots for k-means and k-medoids for each iteration.
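The k-means alternation described above — assign each point to its nearest center, then move each center to the mean of its assigned points — can be sketched as follows. This is a minimal NumPy sketch for intuition only; the assignment itself requires the Point/Cluster/ClusterSet classes and the kMeans(…) signature from the skeleton code, and kmeans_sketch is a name invented here.

```python
import numpy as np

def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Minimal Lloyd's algorithm. X is (n, d); returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    # Random initialization: pick k distinct data points as initial centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned points
        # (an empty cluster keeps its old center).
        new_centers = np.array([X[assign == j].mean(axis=0) if (assign == j).any()
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # converged: assignments and centers are mutually consistent
        centers = new_centers
    return centers, assign
```

Note that this minimizes J(c, µ) only locally: different random initializations can converge to different fixed points, which is exactly why parts (d)–(f) ask you to compare random and "cheat" initializations.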
3 Clustering Faces [12 pts]

Finally (!), we will apply clustering algorithms to the image data. To keep things simple, we will only consider data from four individuals. Make a new image dataset by selecting 40 images each from classes 4, 6, 13, and 16, then translate these images to (labeled) points: [3]

X1, y1 = util.limit_pics(X, y, [4, 6, 13, 16], 40)
points = build_face_image_points(X1, y1)

(a) Apply k-means and k-medoids to this new dataset with k = 4 and initializing the centroids randomly. Evaluate the performance of each clustering algorithm by computing the average cluster purity with ClusterSet.score(…). As the performance of the algorithms can vary widely depending upon the initialization, run both clustering methods 10 times and report the average, minimum, and maximum performance:

             average   min   max
k-means
k-medoids

How do the clustering methods compare in terms of clustering performance and runtime?

[2] In Python, if you have a function stored to the variable func, you can apply it to parameters arg by calling func(arg). This works even if func is a class method and arg is an object that is an instance of the class.

[3] There is a bug in fetch lfw version 0.18.1 where the results of the loaded images are not always in the same order. This is not a problem for the previous parts but can affect the subset selected in this part. Thus, you may see varying results. Results that show the correct qualitative behavior will get full credit.

Now construct another dataset by selecting 40 images each from two individuals, 4 and 13.

(b) Explore the effect of lower-dimensional representations on clustering performance. To do this, compute the principal components for the entire image dataset, then project the newly generated dataset into a lower dimension (varying the number of principal components), and compute the scores of each clustering algorithm.
So that we are only changing one thing at a time, use init='cheat' to generate the same initial set of clusters for k-means and k-medoids. (For each value of l, the number of principal components, you will have to generate a new list of points using build_face_image_points(…).) Let l = 1, 3, 5, ..., 41. The number of clusters K = 2. Then, on a single plot, plot the clustering score versus the number of components for each clustering algorithm (be sure to label the algorithms). Discuss the results in a few sentences.

Some pairs of people are more similar to one another and some more different.

(c) Experiment with the data to find a pair that clustering can discriminate very well and another pair that it finds very difficult (assume you have 40 images for each individual). Describe your methodology (you may choose any of the clustering algorithms you implemented). Report the two pairs in your writeup (display the pairs of images using plot_representative_images), and comment briefly on the results.

1 AdaBoost [5 pts]

In the lecture on ensemble methods, we said that in iteration t, AdaBoost picks (h_t, β_t) that minimizes the objective:

(h*_t(x), β*_t) = argmin_{(h_t(x), β_t)} Σ_n w_t(n) e^{−y_n β_t h_t(x_n)}
               = argmin_{(h_t(x), β_t)} (e^{β_t} − e^{−β_t}) Σ_n w_t(n) I[y_n ≠ h_t(x_n)] + e^{−β_t} Σ_n w_t(n)

We define the weighted misclassification error at time t, ε_t, to be ε_t = Σ_n w_t(n) I[y_n ≠ h_t(x_n)]. Also, the weights are normalized so that Σ_n w_t(n) = 1.

(a) Take the derivative of the above objective function with respect to β_t and set it to zero to solve for β_t and obtain the update for β_t.

(b) Suppose the training set is linearly separable, and we use a hard-margin linear support vector machine (no slack) as a base classifier. In the first boosting iteration, what would the resulting β_1 be?

2 K-means for single dimensional data [5 pts]

In this problem, we will work through K-means for single dimensional data.
(a) Consider the case where K = 3 and we have 4 data points x_1 = 1, x_2 = 2, x_3 = 5, x_4 = 7. What is the optimal clustering for this data? What is the corresponding value of the objective?

(Parts of this assignment are adapted from course material by Jenna Wiens (UMich) and Tommi Jaakola (MIT).)

(b) One might be tempted to think that Lloyd's algorithm is guaranteed to converge to the global minimum when d = 1. Show that there exists a suboptimal cluster assignment (i.e., initialization) for the data in the above part that Lloyd's algorithm will not be able to improve (to get full credit, you need to show the assignment, show why it is suboptimal, and explain why it will not be improved).

3 Gaussian Mixture Models [8 pts]

We would like to cluster data {x_1, ..., x_N}, x_n ∈ R^d, using a Gaussian Mixture Model (GMM) with K mixture components. To do this, we need to estimate the parameters θ of the GMM, i.e., we need to set the values θ = {ω_k, µ_k, Σ_k}, k = 1, ..., K, where ω_k is the mixture weight associated with mixture component k, and µ_k and Σ_k denote the mean and the covariance matrix of the Gaussian distribution associated with mixture component k.

If we knew which cluster each sample x_n belongs to (i.e., we had complete data), we showed in the lecture on Clustering that the log likelihood l is what we have below, and we can compute the maximum likelihood estimate (MLE) of all the parameters:

l(θ) = Σ_n log p(x_n, z_n) = Σ_k Σ_n γ_nk log ω_k + Σ_k ( Σ_n γ_nk log N(x_n | µ_k, Σ_k) )   (1)

Since we do not have complete data, we use the EM algorithm. The EM algorithm works by iterating between setting each γ_nk to the posterior probability p(z_n = k | x_n) (step 1 on slide 26 of the lecture on Clustering) and then using γ_nk to find the value of θ that maximizes l (step 2 on slide 26). We will now derive updates for one of the parameters, i.e., µ_j (the mean parameter associated with mixture component j).

(a) To maximize l, compute ∇_{µ_j} l(θ): the gradient of l(θ) with respect to µ_j.
(b) Set the gradient to zero and solve for µ_j to show that

µ_j = (1 / Σ_n γ_nj) Σ_n γ_nj x_n

(c) Suppose that we are fitting a GMM to data using K = 2 components. We have N = 5 samples in our training data, with x_n, n ∈ {1, ..., N}, equal to {5, 15, 25, 30, 40}. We use the EM algorithm to find the maximum likelihood estimates for the model parameters, which are the mixing weights for the two components, ω_1 and ω_2, and the means for the two components, µ_1 and µ_2. The standard deviations for the two components are fixed at 1. Suppose that at the end of step 1 of iteration 5 in the EM algorithm, the soft assignments γ_nk for the five data items are as shown in Table 1:

γ_n1   γ_n2
0.2    0.8
0.2    0.8
0.8    0.2
0.9    0.1
0.9    0.1

Table 1: The entry in row n and column k corresponds to γ_nk.

What are the updated values for the parameters ω_1, ω_2, µ_1, and µ_2 at the end of step 2 of the EM algorithm?
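As a quick numerical check for part (c), the step-2 (M-step) updates can be computed directly from the responsibilities in Table 1: ω_k is the average responsibility for component k, and µ_k is the responsibility-weighted mean from part (b). This sketch is only arithmetic, not a substitute for showing the derivation:

```python
import numpy as np

x = np.array([5.0, 15.0, 25.0, 30.0, 40.0])  # the N = 5 training samples
gamma = np.array([[0.2, 0.8],                # soft assignments from Table 1
                  [0.2, 0.8],
                  [0.8, 0.2],
                  [0.9, 0.1],
                  [0.9, 0.1]])

# omega_k = (1/N) * sum_n gamma_nk
omega = gamma.mean(axis=0)                                 # [0.6, 0.4]
# mu_k = sum_n gamma_nk * x_n / sum_n gamma_nk
mu = (gamma * x[:, None]).sum(axis=0) / gamma.sum(axis=0)  # [29.0, 14.0]
```

For instance, µ_1 = (0.2·5 + 0.2·15 + 0.8·25 + 0.9·30 + 0.9·40) / 3.0 = 87/3 = 29.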


[SOLVED] Csci 335 assignment 1 to 5 solutions

Create and test a class called Chain. A chain is just a series of items, e.g. [2 7 -1 43] is a chain containing four integers. The purpose of this assignment is to have you create a vector-like class from scratch, however, so you may not use vector, list or other classes from the STL here. Please note that chains are similar in many respects to vectors of items. Unless you get permission from me, don't include any libraries except iostream and cstdlib.

Pay special attention to Weiss's "big five": the destructor, copy constructor, copy assignment operator, move constructor and move assignment operator. Include cout statements at the beginning of the constructors and assignment operators in order to see when these functions are being called.

When your class is complete, the following code should work, with results as commented. Insert this piece of code as is in your testing function.

Chain a, b, c;  // Three empty chains are created
Chain d{10};    // A chain containing just one element: 10
cin >> b;       // User types [8 4 2 1]
c = a;          // Copy assignment
cout << c;


[SOLVED] Cmsi 281/371 assignment 1 to 4 solutions

The assignment will focus on the Chaikin and Bézier curve algorithms. Given a headshot photo of an individual (e.g. the headshot of Ed Sheeran), generate the cartoon version of the photo by sketching it using Chaikin or Bézier curves. Skeleton code has been provided to guide you along the way. The places that you will be required to implement have been marked with a TODO.

I have provided you with a simple Vertex class that allows you to specify the x and y values of a point. You will utilize this class for modeling the control points of your sketch. **Note: the C++ vector class is the equivalent to a list in most other languages. You may use the push_back(Object o) function of the vector class to hold your set of points.

You will complete the following functions for the assignment:
1) generate_points: a function that takes in a set of control points for your Chaikin or Bézier curve algorithm and returns the new set of control points. parameters: vector<Vertex>; returns: vector<Vertex>
2) draw_curve: calls generate_points to generate the control points using the Chaikin or Bézier curve algorithm and forms a curve by connecting the points with lines. parameters: vector<Vertex>, int; returns: none

The parameter n_iter refers to the number of iterations to run the Chaikin or Bézier algorithm. Recall that each time the algorithm is run, you will obtain a set of new points.

Submission: You will submit the following to Bright Space:
1) "assignment1.cpp"
2) Your sketch in JPEG, JPG, or PNG: results.{jpeg, jpg, png}
3) The photo your sketch was based on in JPEG, JPG or PNG: photo.{jpeg, jpg, png}

Grading: I will be compiling the assignment using the following command:

gcc -o assignment1 assignment1.cpp -std=c++14 -lGL -lGLU -lglut

Your code must compile for me to assign points! Your assignment will be graded on:
1) 80% the correctness of your implementation of Bézier's algorithm
2) 20% effort placed recreating the subject via your sketch, e.g.
a simple happy face does not do Ed Sheeran justice.

Late Policy: For each day the assignment is late, 50% of its worth will be deducted, e.g. 100% on time, 50% 1 day late, 25% 2 days late, etc.

Given a set of data points that describe a cube centered at the origin in 3-dimensional space, our goal is to rotate the cube about an axis. This assignment will introduce Vertex Arrays in OpenGL for modeling sets of points in 3D space. The skeleton code is provided to guide the assignment.

I have provided you with a simple degree-to-radian function for converting a given degree theta to radians as input to your rotation matrix. In addition, you are also given a vector2array function that converts a vector of GLfloat to an array of GLfloat to be rendered. **Note: the C++ vector class is the equivalent to a list in most other languages.

You will complete the following functions for the assignment, marked with TODO:
1) to_homogenous_coord: converts a vector of cartesian coordinates (x, y, z) to homogeneous coordinates (x, y, z, 1)
2) to_cartesian_coord: converts a vector of homogeneous coordinates (x, y, z, 1) to cartesian coordinates (x, y, z)
3) rotation_matrix_x: outputs the rotation matrix about the x-axis
4) rotation_matrix_y: outputs the rotation matrix about the y-axis
5) rotation_matrix_z: outputs the rotation matrix about the z-axis
6) mat_mult: performs matrix multiplication between two matrices

The camera has been set up for you to point towards the origin. I have provided points (a vector) which contains the set of points defining the cube in 3D space. Your goal is to apply some rotation to the points in 3D space. I have also defined an array of GLfloat called colors. These colors are mapped to the planes so that you will be able to distinguish each plane of the cube. You will notice that there is a global variable theta. theta defines the degree of rotation, which you will need to convert to radians using the provided deg2rad function.
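The homogeneous-coordinate and rotation-matrix functions listed above can be sketched as follows (in Python/NumPy for brevity; the assignment itself expects the equivalent C++ implementations, and function names follow the handout):

```python
import numpy as np

def deg2rad(theta_deg):
    """Degrees to radians, as described in the handout."""
    return theta_deg * np.pi / 180.0

def to_homogeneous(points):
    """(n, 3) cartesian -> (n, 4) homogeneous, appending w = 1."""
    return np.hstack([points, np.ones((len(points), 1))])

def to_cartesian(points_h):
    """(n, 4) homogeneous -> (n, 3) cartesian, dividing by w."""
    return points_h[:, :3] / points_h[:, 3:4]

def rotation_matrix_z(theta):
    """4x4 rotation about the z-axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

# Rotate the point (1, 0, 0) by 90 degrees about z; it should land at (0, 1, 0).
p = np.array([[1.0, 0.0, 0.0]])
rotated = to_cartesian(to_homogeneous(p) @ rotation_matrix_z(deg2rad(90)).T)
```

The x- and y-axis rotation matrices follow the same pattern with the cosine/sine block moved to the other two coordinate rows.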
Submission: You will submit the following to Bright Space:
1) "assignment2.cpp"
2) A recording (avi, mov) of the program running (it should show a cube rotating about the axis or axes of your choice, e.g. rotation about the x and y axes)

Grading: I will be compiling the assignment using the following command:

gcc -o assignment2 assignment2.cpp -lGL -lGLU -lglut

Your code must compile for me to assign points! Your assignment will be graded on:
1) the correctness of your implementation of the above functions

Late Policy: For each day the assignment is late, 50% of its worth will be deducted, e.g. 100% on time, 50% 1 day late, 25% 2 days late, etc.

Given a real world scene, your goal is to replicate it using hierarchical modeling of the objects. You will begin by building prisms from planes using 3D translations and rotations. These prisms can then be used to form parts of objects. The skeleton code is provided in a separate assignment3.cpp file.

I have provided you with a function init_plane which initializes a square plane of unit lengths centered at the origin (0, 0, 0). You will utilize this initial plane to form a cube, which can then be transformed and used to form objects. I have also provided you with the deg2rad, vector2array, to_homogeneous_coord, to_cartesian_coord, and init_color functions.

● deg2rad — converts degrees to radians
● vector2array — converts a vector into an array
● to_homogeneous_coord — converts Cartesian coordinates to homogeneous coordinates
● to_cartesian_coord — converts homogeneous coordinates to Cartesian coordinates
● init_color — creates a color map for the scene

As vector2array dynamically allocates space and copies the elements of the vector into the array, you need to remember to deallocate the arrays created via vector2array after you have rendered your scene, to prevent memory leaks. Since arrays are static, it is easier to work with the vector class before producing the final array of vertices that will be used for rendering.
You will need to first implement the build_cube function, which creates a unit cube. Then apply transformations to the cube to create objects in the scene (init_scene). Once the scene is built, you will apply rotation to the scene such that the entire scene spins while the camera stays still. You may set the parameters of your camera in init_camera.

For the mat_mult function, please implement it such that it multiplies a transformation matrix A by the entire points matrix B, rather than applying transformation A point by point to B. The math header file that I have included should be sufficient for the operations needed for this project.

You will complete the following functions for the assignment:
1) translation_matrix(float dx, float dy, float dz)
2) scaling_matrix(float sx, float sy, float sz)
3) rotation_matrix_x(float theta)
4) rotation_matrix_y(float theta)
5) rotation_matrix_z(float theta)
6) mat_mult(vector A, vector B)
7) build_cube()
8) init_camera()
9) init_scene()
10) display_func()

Note: functions 1-7 serve as helper functions to generate the objects in the scene (i.e. build_cube will create a cube which you can then modify to form parts of an object). I would suggest using vector2array as the final step, since going from an array to a vector requires you to keep track of the number of elements in the array (which could be hard after generating thousands of points).

Submission: You will submit the following to Bright Space:
1) assignment3.cpp
2) Either 6 different viewpoints taken from different angles of your scene, submitted as JPEG, JPG or PNG with the names view{1,2,3,4,5,6}.{jpeg, jpg, png}, or a short video panning over regions of your scene
3) A photo in the form of JPEG, JPG or PNG (scene.{jpeg, jpg, png}) taken of the scene on which you are basing your model

Grading: I will be compiling the assignment using the following command:

g++ -o assignment3 assignment3.cpp -lglut -lGLU -lGL

Your code must compile for me to assign points!
Your assignment will be graded on:
1) the correctness of your implementation of the hierarchical modeling procedure
2) the correctness of your implementation of your camera and scene modeling
3) effort placed recreating the real world scene

Late Policy: For each day the assignment is late, 50% of its worth will be deducted, e.g. 100% on time, 50% 1 day late, 25% 2 days late, etc.

Given a scene, our goal is to generate objects using hierarchical modeling (based on assignment 3) and add shading (colors) to these objects by defining a light source (a 3-dimensional vector), computing the surface normals, and generating the observed colors using Gouraud shading. Note that we define the illumination (e.g. the observed colors) as follows:

Note: you may reuse the objects you defined in assignment 3, but you also MUST make sure that you have sufficient objects in the scene to make it look interesting (at least a few colors).

Once you have defined an object, I would suggest keeping the operations that generate the object as a standalone function (e.g. build_chair(…) for generating a chair) such that you may call these functions to quickly prototype a scene.

We will be working with an ObjectModel class which contains 4 vector objects, holding information about the points defining each plane, their respective base colors, their normals (each point on the plane should have the same normal), and the actual observed colors. I have provided you with an overloaded function init_base_color, which you can use to define the base colors for each plane (to color a cube, for example, you will need to call this 6 times).

There are 3 portions to this assignment: 1) generating surface normals, 2) applying Gouraud shading, and 3) rendering objects with colors based on camera and light source positions. The next section of the specification will detail the road map for these portions.
Surface Normals

In order to know how light will interact with an object, we must know the object's surface normal, which can be obtained using the cross product of two vectors on the object surface (which is a plane, where normals are well-defined). You will implement the cross_product method and use it to implement the generate_normals method.

Gouraud Shading

We will implement our shading equation, which models how an object will be illuminated. You will implement the dot_product method in order to measure the strength of the light rays reflecting off an object. We will use the dot_product to implement the apply_shading method. Note: for this assignment we are implementing Gouraud shading, which includes the interpolation of the colors between each point defining the surface. Luckily the interpolation is already taken care of by OpenGL, so all we will need to do is define the observed colors for each point via the shading equation.

Rendering Objects with Colors

In the previous assignment, we simply let the colors be defined as a static base color value (randomly generated). We now know how light behaves and how we should model its interaction with objects. Hence, we will need to populate each member of ObjectModel. The scene will be constructed using the defined points. The set of base colors of the scene will be stored in base_colors. The apply_shading method will be used to generate the actual observed colors using the surface normals, light source, and camera. The observed colors are stored as the color member of ObjectModel.
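The two geometric kernels described above — a cross product for the surface normal and a dot product for the diffuse term — can be sketched like this (a Python sketch for intuition; the assignment implements cross_product, dot_product, generate_normals, and apply_shading in C++ on the ObjectModel class, and the simple Lambertian max(0, N·L) term here stands in for whatever full illumination equation the original handout gives):

```python
import numpy as np

def cross_product(u, v):
    """Cross product of two 3-vectors; the result is perpendicular to both."""
    return np.array([u[1]*v[2] - u[2]*v[1],
                     u[2]*v[0] - u[0]*v[2],
                     u[0]*v[1] - u[1]*v[0]])

def plane_normal(p0, p1, p2):
    """Unit normal of the plane through three points (two edge vectors crossed)."""
    n = cross_product(p1 - p0, p2 - p0)
    return n / np.linalg.norm(n)

def diffuse_intensity(normal, light_dir):
    """Lambertian term: max(0, N . L), with the light direction normalized."""
    l = light_dir / np.linalg.norm(light_dir)
    return max(0.0, float(np.dot(normal, l)))

# Example: a face lying in the z = 0 plane, lit from directly above.
p0, p1, p2 = np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([0., 1., 0.])
n = plane_normal(p0, p1, p2)                              # (0, 0, 1)
intensity = diffuse_intensity(n, np.array([0., 0., 1.]))  # full brightness
```

Scaling each plane's base color by this intensity (per vertex) is what apply_shading does; OpenGL then interpolates the resulting vertex colors across the face, which is the Gouraud part.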
Submission: You will submit the following to Bright Space:
1) assignment4.cpp
2) Either 6 different viewpoints taken from different angles of your scene, submitted as JPEG, JPG or PNG with the names view{1,2,3,4,5,6}.{jpeg, jpg, png}, or a short video panning over regions of your scene
3) A photo in the form of JPEG, JPG or PNG (scene.{jpeg, jpg, png}) taken of the scene on which you are basing your model

Grading: I will be compiling the assignment using the following command:

g++ -o assignment4 assignment4.cpp -lglut -lGLU -lGL

Your code must compile for me to assign points! Your assignment will be graded on:
1) the correctness of your implementation of the surface normals
2) the correctness of your implementation of the illumination equation
3) effort placed recreating the real world scene

Late Policy: For each day the assignment is late, 50% of its worth will be deducted, e.g. 100% on time, 50% 1 day late, 25% 2 days late, etc.
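The matrix convention requested in the hierarchical-modeling assignment — mat_mult applies a transformation matrix to the entire points matrix at once, rather than point by point — can be sketched as follows (Python/NumPy for brevity; the C++ versions use flat vectors of GLfloat):

```python
import numpy as np

def translation_matrix(dx, dy, dz):
    """4x4 homogeneous translation."""
    m = np.eye(4)
    m[:3, 3] = [dx, dy, dz]
    return m

def scaling_matrix(sx, sy, sz):
    """4x4 homogeneous scale about the origin."""
    return np.diag([sx, sy, sz, 1.0])

def mat_mult(a, b):
    """Apply transform a to ALL columns of the 4xN points matrix b at once."""
    return a @ b

# A unit-cube corner at (1, 1, 1), as one homogeneous column:
points = np.array([[1.0], [1.0], [1.0], [1.0]])
# Scale by 2 about the origin, then translate up by 3 (note the order:
# the rightmost matrix is applied to the points first).
m = mat_mult(translation_matrix(0, 0, 3), scaling_matrix(2, 2, 2))
out = mat_mult(m, points)   # corner moves to (2, 2, 5)
```

Composing the per-object transforms into a single matrix before multiplying the points matrix is exactly what makes hierarchical modeling cheap: each child object reuses its parent's composed transform.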


[SOLVED] BSTA011 - BUSINESS STATISTICS

BSTA011 - BUSINESS STATISTICS GROUP ASSIGNMENT

This assignment is designed to assist you to achieve the following learning outcomes:
a. Apply appropriate quantitative analytical techniques to qualify, support, select and evaluate data as information for use in business decision-making.
b. Interpret and communicate results of quantitative analyses for business decision-making.
c. Use a computer-based data analysis package (i.e. Excel) to critically analyse data.

Assignment value: 20%

Group of 3-5 students. The group members must be from the same tutorial. Your tutor will put you in groups in tutorials. In the event that you cannot find any group, let your Lecturer/tutor know asap. You are NOT allowed to complete this assignment by yourself or in groups of less than 3 members.

Submission: soft copy, due 11:59 PM Sunday, 07/12/2025, via CANVAS.

Your team has been accepted as interns at Landcom. Landcom manages strategic and complex residential projects. Your first job is to conduct an analysis based on the recent sale prices in three suburbs of New South Wales for the years 2021, 2022 and 2023, from All Homes or real estate. Your team needs to perform a comprehensive statistical analysis of the suburbs, which your tutors will suggest.

TASK 1: LOCATE AND SELECT DATA

Q1. Collect and compute the appropriate descriptive statistics of the "sold house price", "sold house land size", and "sold house number of rooms" for the years 2021, 2022 and 2023 for the suburb selected by your tutor. The descriptive statistics measures include central tendency (mean), variability (standard deviation), mode, quartiles, range, and interquartile range. Show infographics (e.g., pie chart, bar chart, etc.) of the 2021, 2022 and 2023 data for the following variables:
(a) Sold house price
(b) Sold house land size
(c) Sold house number of rooms

The sample size should be at least 30 for each year (2021, 2022 and 2023) for each suburb.
So, for one suburb, at least 90 houses in total should be recorded over the three-year period.

TASK 2: DATA DESCRIPTION AND ANALYSIS

Q2. Based on the descriptive statistics from Q1, briefly comment on the central tendency and variability of the three suburbs for 2021, 2022 and 2023. Combine data from all group members in an Excel spreadsheet and use this collated sample to answer the following questions.

Q3. Choose one suburb and perform the following task on the 2021 and 2022 data: The historical data indicates that high house prices (more than the average price; you should have the average house price of each suburb from question 1) are more likely to be associated with land size, compared to low house prices (below the average house price). What is the probability of a high house price given that the house land size is extended (more than the average land size for the suburb)? What is the probability of a low house price given that the land size is non-extended (land size below average)? Analyse your collated sample and examine whether this is indeed the case. Show the steps in your analysis (including justification for the choice of techniques used and all calculations) and report your findings clearly, using a probability matrix.

Table: Probability matrix for 2021

                         High house price   Low house price   Total
Extended land size
Non-extended land size
Total
Grand total

Table: Probability matrix for 2022

                         High house price   Low house price   Total
Extended land size
Non-extended land size
Total
Grand total

Q4. (a) Choose one suburb and perform the following task on the 2021, 2022 and 2023 data. It is a common perception that the land size and the number of rooms available influence the house price. Investigate the following relationships using multiple linear regression analysis: (i) explore the relationship between land size and the house price; (ii) explore the relationship between the available number of rooms and the house price.
Use the multiple linear regression model and interpret the results: the p-values of the independent variables, multiple R, adjusted R-squared, the physical meaning of the coefficients, and the significance of the F-statistic.

(b) Using the suburb selected for part (a), conduct a regression analysis of the house prices against external economic factors (cash rate target, inflation rate, and unemployment rate) for the years 2021, 2022, and 2023, utilizing data from the Reserve Bank of Australia (RBA). Apply multiple linear regression models to examine the relationships between these variables. Interpret the statistical measures derived from the regression models, including the multiple R, adjusted R-squared, and the significance of the F-statistic. Evaluate the importance of the independent variables by interpreting their p-values. Develop two distinct regression models, one for task (a) and one for task (b).

Q5. Choose one suburb and perform the following task on the 2022 data: Analyse the frequencies of two variables (house price level and land size) with multiple categories to determine whether the two variables are independent. Conduct a chi-square hypothesis test at the 0.05 level to determine whether house price level and land size are independent. Use the following table for the chi-square test:

                         High house price   Low house price   Total
Extended land size
Non-extended land size
Grand total

Q6. What is the average house price of each selected suburb for 2023? (Use the house price average from question 1 and construct a 95% confidence interval for the average house price for each selected suburb of New South Wales for the year 2023.) Note: the population standard deviation of house prices in New South Wales is $20,000.

Q7. A recent study has claimed that the average house price in New South Wales is $1,187,200. Use your collected data to test this claim for each selected suburb for the year 2023 (note: use the sample statistics from question 1).
Note: The population standard deviation of house prices in New South Wales is $20,000. Is there any evidence to suggest that the average house price has changed, at a 5% significance level? Report your findings with clear conclusions and all supporting calculations.

Q8. Develop a rating for the three suburbs assigned to you, based on the crime statistics provided in the Sydney Suburban Review. Complete the following tasks:

(i) Define a rating scale for the suburbs based on crime rates. For example:
• A: Suburbs with the lowest crime rates.
• B: Suburbs with moderate crime rates.
• C: Suburbs with the highest crime rates.
To establish the ratings, calculate the average crime incidents for NSW from the crime incident data for the 408 suburbs provided in the Sydney Suburban Review. Suburbs with crime rates above the average should be categorised as C, those around the average as B, and those below the average as A.

(ii) Create visual representations, such as bar charts or maps, to display the crime ratings for each suburb.

(iii) Investigate the influence of crime rates on house prices. Provide a well-supported argument based on relevant evidence and research findings.
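For Q6 and Q7 above, the population standard deviation is given ($20,000), so both the interval and the test are z-based. A minimal Python sketch of the arithmetic follows; the sample mean and sample size are hypothetical placeholders — substitute your own statistics from Question 1.

```python
import math

# Known population standard deviation from the assignment brief.
sigma = 20_000.0

# Hypothetical 2023 sample statistics for one suburb (replace with the
# mean and sample size from your Question 1 summary).
x_bar, n = 1_195_000.0, 30

# Q6: 95% confidence interval, x_bar +/- z * sigma / sqrt(n), z = 1.96.
z_crit = 1.96
margin = z_crit * sigma / math.sqrt(n)
ci = (x_bar - margin, x_bar + margin)

# Q7: two-tailed z-test of H0: mu = 1,187,200 at the 5% level.
mu_0 = 1_187_200.0
z_stat = (x_bar - mu_0) / (sigma / math.sqrt(n))
reject_h0 = abs(z_stat) > z_crit

print(ci)
print(round(z_stat, 3), reject_h0)
```

With these placeholder numbers the test statistic is about 2.136, which exceeds 1.96, so the claim would be rejected; your own data may of course lead to the opposite conclusion.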

$25.00 View

[SOLVED] Microelectronics Circuit Analysis and Design

1) Design a lithium battery pack discharge overheating protection circuit which senses the temperature of a 7.4V lithium battery pack (which consists of two 3.7V batteries in series) and disconnects it from the load it's powering if the pack gets too hot, then reconnects it again when it cools. You should use a thermistor to sense temperature, and may use any commercially available op-amps, comparators, FETs, BJTs, Zener diodes (your choice of voltage) and other types of diodes, resistors, and any capacitors or inductors you need. The thermistor has a resistance of greater than 10 kΩ when the battery is cool, and this reduces to below 8 kΩ when it is too hot; your circuit should disconnect when the resistance decreases to below 8 kΩ and reconnect when the resistance increases again to above 10 kΩ. Your circuit should operate from the voltages available from the two series-connected batteries in the pack (you have access to the terminal between the two series-connected batteries). For highest credit on this problem, give manufacturer part numbers for all semiconductors (such as those given in the Analog Devices parts kit).

2) Design a ten-bit D/A converter which generates an output voltage in the range of 0 to 1.023 V (i.e. a 1 mV LSb). You may use op-amps, comparators, FETs, BJTs, Zener diodes (your choice of voltage) and other types of diodes, resistors, and any capacitors or inductors you need. Assume that you have whatever power supply voltages you need and that the digital inputs are 0 V for logic 0 and +5 V for logic 1.
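For problem 1, the comparator's two trip points follow from simple divider arithmetic. The sketch below assumes a specific topology that the brief does not mandate: the thermistor as the lower leg of a divider fed from the 3.7 V mid-pack tap through a hypothetical 10 kΩ fixed resistor. Any values here are illustrative, not a prescribed design.

```python
# Divider trip-point sketch for the over-temperature comparator.
# Assumptions (not specified in the brief): NTC thermistor is the lower
# leg of a divider from the 3.7 V mid-pack terminal through a fixed
# 10 kOhm resistor; the comparator monitors the divider voltage.
v_supply = 3.7       # volts, mid-pack tap
r_fixed = 10_000.0   # ohms, hypothetical fixed divider resistor

def divider_v(r_therm):
    """Voltage across the thermistor (NTC: resistance falls as it heats)."""
    return v_supply * r_therm / (r_fixed + r_therm)

v_disconnect = divider_v(8_000.0)    # trip when R drops below 8 kOhm
v_reconnect  = divider_v(10_000.0)   # re-arm when R rises above 10 kOhm

# The comparator's hysteresis band must span these two voltages.
hysteresis = v_reconnect - v_disconnect
print(round(v_disconnect, 3), round(v_reconnect, 3), round(hysteresis, 3))
```

The roughly 0.2 V gap between the two thresholds is what the positive-feedback (hysteresis) network around the comparator has to provide, so the load switches cleanly rather than chattering near the trip temperature.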


[SOLVED] IB Mathematics Internal Assessment (SPSS)

IB Mathematics Internal Assessment
From Pendulums to Potential Fields: A Comparative Analysis of Critical Points in Physical Systems Across Dimensions
Word Count: 3927

Table of Contents
1. Introduction
2. Differentiation of One-Variable Functions
  2.1. Examples of 1D Curves and 2D Surfaces in Physics
  2.2. Defining Critical Points in 1D Curves
    2.2.1. Maxima and Minima in Physical Systems
    2.2.2. Examples of Critical Points in Mechanical Energy
    2.2.3. Inflection Points and Transition States
    2.2.4. Examples of Concavity Change in Motion
  2.3. Calculation and Classification of Critical Points
    2.3.1. Calculating Maxima and Minima in Potential Energy
    2.3.2. First Derivative Test for Physical Equilibrium
    2.3.3. Example: Simple Pendulum Analysis
    2.3.4. Calculating Inflection Points in Kinematics
    2.3.5. Finding All Critical Points in Oscillatory Systems
    2.3.6. The Second Derivative Test for Stability
3. Transitioning to Analyzing Critical Points in 2D Surfaces
  3.1. Visualization of 2D Physical Surfaces
  3.2. Defining Critical Points in Potential Energy Landscapes
  3.3. Finding Critical Points in 2D Physical Systems
    3.3.1. Example: Particle in Magnetic Field
  3.4. Classifying the Nature of Critical Points
    3.4.1. The Hessian Matrix in Physical Context
    3.4.2. The Determinant and Stability Criterion
    3.4.3. Example: Saddle Points in Electrostatic Potentials
    3.4.4. Experimenting with Different Physical Potentials
    3.4.5. Example: Gravitational Potential Analysis
4. Conclusion
  4.1. Comparing the Dimensions in Physical Contexts
  4.2. Generalization to Higher-Dimensional Physical Systems
Bibliography

1 Introduction

The study of critical points in calculus finds profound applications in understanding physical phenomena, from the simple oscillation of a pendulum to the complex energy landscapes of molecular systems.
My fascination with this topic began during physics laboratory sessions, where I observed how mathematical concepts directly translate to physical behavior. This investigation explores how critical point analysis extends from one-dimensional mechanical systems to two-dimensional potential fields, addressing the research question: How does the mathematical framework for critical point analysis evolve when transitioning from one-dimensional to two-dimensional physical systems, and what new physical insights emerge from this dimensional expansion? Through concrete physical examples and mathematical rigor, this paper demonstrates the powerful connection between abstract calculus and tangible physical reality.

2 Differentiation of One-Variable Functions

2.1 Examples of 1D Curves and 2D Surfaces in Physics

Figure 1: Potential energy curve of a simple pendulum, V(θ) = mgl(1 − cos θ)

2.2 Defining Critical Points in 1D Curves

2.2.1 Maxima and Minima in Physical Systems

In physical contexts, critical points represent equilibrium positions:
• Minima: stable equilibrium (the system returns after a small displacement)
• Maxima: unstable equilibrium (the system moves away after a small displacement)

2.2.2 Examples of Critical Points in Mechanical Energy

For a mass-spring system with potential energy V(x) = (1/2)kx²:

V'(x) = kx = 0  ⇒  x = 0                       (1)
V''(x) = k > 0  ⇒  stable equilibrium           (2)

2.2.3 Inflection Points and Transition States

In physical systems, inflection points often represent transition states between different regimes of behavior.
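The mass-spring result above can be checked numerically. This short Python sketch (not part of the original IA) approximates V' and V'' by central differences and confirms that x = 0 is an equilibrium with positive curvature, i.e. a stable minimum; the spring constant k = 2 is an arbitrary choice.

```python
# Numerical check of equations (1)-(2): V(x) = 0.5*k*x**2 has one
# critical point at x = 0 with V''(0) = k > 0 (stable equilibrium).
k = 2.0
V = lambda x: 0.5 * k * x * x
h = 1e-5  # finite-difference step

def dV(x):
    """Central-difference first derivative; the net force is -dV."""
    return (V(x + h) - V(x - h)) / (2 * h)

def d2V(x):
    """Central-difference second derivative (curvature of the potential)."""
    return (V(x + h) - 2 * V(x) + V(x - h)) / (h * h)

x_eq = 0.0
print(abs(dV(x_eq)) < 1e-8)   # True: V'(0) = 0, an equilibrium
print(d2V(x_eq) > 0)          # True: V''(0) = k > 0, stable minimum
```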
2.3 Calculation and Classification of Critical Points

2.3.1 Calculating Maxima and Minima in Potential Energy

Consider a particle in an anharmonic oscillator with potential: (3)

2.3.2 First Derivative Test for Physical Equilibrium

The first derivative test identifies where the net force vanishes: (4)

2.3.3 Example: Simple Pendulum Analysis

As detailed in the theoretical framework, the simple pendulum demonstrates a clear physical interpretation of mathematical critical points.

2.3.4 Calculating Inflection Points in Kinematics

In motion analysis, inflection points in displacement-time graphs indicate acceleration changes.

3 Transitioning to Analyzing Critical Points in 2D Surfaces

3.1 Visualization of 2D Physical Surfaces

Figure 2: Two-dimensional potential energy surface showing multiple critical points

3.2 Defining Critical Points in Potential Energy Landscapes

In two dimensions, critical points satisfy:

∂V/∂x = 0  and  ∂V/∂y = 0                      (5)

3.3 Finding Critical Points in 2D Physical Systems

3.3.1 Example: Particle in Magnetic Field

Consider a charged particle in a magnetic field with potential: (6)

Finding critical points: (7) (8)

3.4 Classifying the Nature of Critical Points

3.4.1 The Hessian Matrix in Physical Context

The Hessian matrix encodes the curvature information crucial for stability analysis:

H = | Vxx  Vxy |                                (9)
    | Vyx  Vyy |

3.4.2 The Determinant and Stability Criterion

The Hessian determinant D = Vxx·Vyy − Vxy² classifies critical points:
• D > 0, Vxx > 0: stable equilibrium (minimum)
• D > 0, Vxx < 0: unstable equilibrium (maximum)
• D < 0: saddle point (mixed stability)

3.4.3 Example: Saddle Points in Electrostatic Potentials

Consider the electrostatic potential: (10)

Critical point at (0, 0) with Hessian: (11)

This saddle point represents an unstable equilibrium where the potential decreases in the y-direction but increases in the x-direction.
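The determinant criterion can be exercised directly. The sketch below uses V(x, y) = x² − y² as a stand-in saddle potential (the paper's equation (10) is not reproduced in this extract, so this specific form is an assumption consistent with "increases in x, decreases in y"): at the origin Vxx = 2, Vyy = −2, Vxy = 0, so D = −4 < 0.

```python
# Hessian-determinant classification of a 2D critical point, following
# the D > 0 / D < 0 rules in the section above.
def hessian_classify(vxx, vyy, vxy):
    """Classify a critical point from its second partial derivatives."""
    d = vxx * vyy - vxy ** 2   # Hessian determinant
    if d < 0:
        return "saddle point"
    if d > 0:
        return "minimum (stable)" if vxx > 0 else "maximum (unstable)"
    return "inconclusive"      # D = 0: the test gives no answer

# For the assumed saddle V = x^2 - y^2 at the origin:
print(hessian_classify(2.0, -2.0, 0.0))   # saddle point
# For a bowl such as V = x^2 + y^2 at the origin:
print(hessian_classify(2.0, 2.0, 0.0))    # minimum (stable)
```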
3.4.4 Experimenting with Different Physical Potentials

3.4.5 Example: Gravitational Potential Analysis

For a mass in a gravitational field with an additional quadrupole moment: (12)

This complex potential demonstrates multiple critical points with different stability characteristics.

4 Conclusion

4.1 Comparing the Dimensions in Physical Contexts

The transition from one-dimensional to two-dimensional analysis reveals fundamental insights:

Table 1: Comparison of Critical Point Analysis in Physical Systems

Aspect               | 1D Systems          | 2D Systems
Equilibrium types    | Minima, maxima      | Minima, maxima, saddle points
Stability analysis   | Single direction    | Multiple directions
Physical examples    | Pendulum, spring    | Molecular conformations, field potentials
Mathematical tools   | Second derivative   | Hessian matrix
Complexity           | Simple              | Rich, anisotropic behavior

4.2 Generalization to Higher-Dimensional Physical Systems

The principles established extend naturally to higher dimensions:
• Three-dimensional potential fields in electromagnetism
• Multi-dimensional configuration spaces in statistical mechanics
• High-dimensional energy landscapes in machine learning

The emergence of saddle points in two dimensions represents a crucial conceptual advancement, enabling understanding of transition states and anisotropic stability that are ubiquitous in real physical systems.

Bibliography
1. Goldstein, H., Poole, C., & Safko, J. (2002). Classical Mechanics (3rd ed.). Addison-Wesley.
2. Marion, J. B., & Thornton, S. T. (2004). Classical Dynamics of Particles and Systems (5th ed.). Brooks/Cole.
3. Stewart, J. (2015). Calculus: Early Transcendentals (8th ed.). Cengage Learning.
4. Feynman, R. P., Leighton, R. B., & Sands, M. (2005). The Feynman Lectures on Physics. Addison-Wesley.
5. Kibble, T. W. B., & Berkshire, F. H. (2004). Classical Mechanics (5th ed.). Imperial College Press.


[SOLVED] Assignment 5

Assignment #5

Assignment Overview

Black Friday is almost here! You are managing the local electronics store, and you know that Black Friday is always a crazy day in your store. You want to ensure that the shopping experience is a safe one for everyone. To that end, you've decided to control the flow of customers into and within your store. Customers will be grouped at one of two entryways, and then will be escorted to the area of the store they wish to shop in to maintain some semblance of order.

● Escort Team 0 can accommodate up to 100 shoppers.
● Escort Team 1 can accommodate up to 50 shoppers.

There are 500 shoppers lined up at the main entrance (Zone 0), and they all want to get to one of the other four zones.

● 50 shoppers want to get to Zone 1 (Appliances).
● 100 shoppers want to get to Zone 2 (TVs).
● 250 shoppers want to get to Zone 3 (Smartphones).
● 100 shoppers want to get to Zone 4 (Video Games).

The 500 shoppers waiting at the main entrance (Zone 0) will need to be escorted to the other four zones by one of the two escort teams. However, only one escort team can be in any zone at one time, to avoid confusion. For this assignment, write a C program which will simulate this activity.

Purpose

● Learn how to use multi-threading and mutual exclusion to safely update shared values.
● Get experience with either
  o pthread_mutex_init(), pthread_mutex_lock(), pthread_mutex_unlock() and pthread_mutex_destroy() system functions; OR
  o sem_init(), sem_wait(), sem_post() and sem_destroy() system functions.
● Gain more experience with the C programming language from an OS perspective.

Instructions

Attached to this assignment is a tarball with the following files in it. None of these files should be modified: Makefile
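The assignment itself must be written in C with pthread mutexes or POSIX semaphores, but the core requirement — protecting a shared read-modify-write from concurrent escort threads — can be sketched language-neutrally. Here is the pattern in Python with threading.Lock standing in for pthread_mutex_lock/unlock; the trip counts are illustrative only, chosen so the two hypothetical teams deliver all 500 shoppers.

```python
import threading

# Conceptual model of the mutual-exclusion requirement: two escort
# threads update a shared count of delivered shoppers, and a lock keeps
# each read-modify-write atomic. (Not the required C solution.)
delivered = 0
lock = threading.Lock()

def escort(batch_size, trips):
    global delivered
    for _ in range(trips):
        with lock:               # pthread_mutex_lock / _unlock analogue
            delivered += batch_size

# Team 0 moves up to 100 shoppers per trip, Team 1 up to 50 (per brief).
t0 = threading.Thread(target=escort, args=(100, 3))
t1 = threading.Thread(target=escort, args=(50, 4))
t0.start(); t1.start()
t0.join(); t1.join()

print(delivered)   # 3*100 + 4*50 = 500, all shoppers escorted
```

The "only one escort team per zone" rule maps naturally onto one mutex (or binary semaphore) per zone in the C version, acquired before a team enters and released when it leaves.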


[SOLVED] Media analysis project

Media analysis project

Topic: How do Chinese and American media create opposing narratives in the TikTok security controversy?

Media analysis project (25%)

The objective of this assignment is to apply the knowledge gained during the semester to analyze the role of the media in a conflict of your choice. You will conduct a media analysis project and present your findings in a written report. This report will be between 2500 and 3000 words (excluding bibliography). Based on the work done for the background paper, you will:

● Decide what type(s) of media to focus on (e.g. newspapers, social media, cable news, radio; local, national, international; fringe/mainstream, etc.).
● Settle on 1 to 3 questions to ask about the role of the media in relation to the given conflict.
● Develop a method to gather media material and analyze it. There will be ample opportunities to discuss methods throughout the course as we encounter various methods over the weeks. You'll be able to draw on the knowledge gained from the replication study. I can also direct you towards method readings relevant to your individual projects.

The report should include a presentation of, and rationale for, the key questions; a section about the data gathering and analysis method; a presentation of findings; and, finally, a discussion of the findings. Your research paper should be 2500-3000 words long (excluding the bibliography). You should include a fully formatted bibliography using the APA style (see here for a quick guide). There is no limit on how many references you can include and no strict minimum either, but an indicative number is somewhere around 10 references.

Key criteria for grading include:

● Coherence of the argument
● Relevance and robustness of the methodological approach
● Clarity of the writing
● Submitted on time
● Formatting as per guidelines


[SOLVED] PM600 Research Project

Assessment Task Information Key details: Assessment title: Written assignment (individual): Research Questions & Literature Review Module Name: Research Project Module Code: PM600 Assessment will be set on: Cycle 2, Week 1 Feedback opportunities: Before deadline: Peer feedback in class, tutor feedback in class 1:1 After deadline: Written feedback available on Turnitin two weeks after submission. Assessment is due on: 14/12/25 Assessment weighting: 30% Assessment Instructions What do you need to do for this assessment? In your previous assessment you produced a research project proposal on which you have received feedback from your tutor. Use this feedback to improve your project design by improving or expanding two sections of the proposal: 1. The Research Questions should be refined to ensure feasibility and clarify the focus of your research. 2. The sources in the Annotated Bibliography should be compared and contrasted with other sources in your Literature Review. These should be grouped into themes containing an evaluation of the sources used and lead to the identification of a research gap, which you will try to fill with the study you will carry out in the next stage of the project. The Literature Review should give an overview of the research done previously, including details on findings and the methodology used in the sources reviewed. This could be resubmitted to a potential research supervisor. The aim of these two sections is to provide appropriate focus of your research project and literature-based evidence which can help persuade your research supervisor about the importance and need for this research project. Guidance: For this assessment you should make use of the following formative activities that you have already completed.  
These activities have been designed to support this assessment:

· Week 7: Research Proposal assessment feedback
· Week 9: Peer Review Literature Review Worksheet
· Week 10: Literature Review 1:1s with class tutor

Please note: Both parts of this assessment are individual tasks, which means that you are expected to complete them by yourself.

Structure:

Project Title: Give your project a working title (15-20 words).

Introduction: State the layout of your literature review and key points to come (maximum 150 words; optional).

Section 1: Literature Review (~1200-1500 words)
· In this section, you are to review the available academic literature on your chosen topic. You should compare and contrast existing research findings in order to identify gaps which your research aims to fill, and critically evaluate the quality of existing literature on your chosen topic.
· You can organise this section into further sub-sections, such as an introduction and sub-sections dedicated to specific research themes. Themes could be different areas of your topic that you researched or different areas of literature you find in your research. Sub-section headings should reflect the relevant content of that section.

Section 2: Reflection on Research Questions (~200-300 words)
· This section should include your Research Questions (maximum three).
· You should provide a justification of these questions and relate them to gaps in the existing literature on your chosen topic.
· The research questions should be a result of the research done in the Literature Review.
· This assignment should not answer your research questions.
· You can also discuss your hypotheses for your project, highlighting the kinds of results you expect to find based on your reading.

Reference List (excluded from the word count)
In this section you should include a list of all sources you have used to complete this assignment.
Theory and/or task resources required for the assessment: This is a secondary research task, requiring you to draw on sources to evaluate your research design and the topic you wish to research. Note: this does not mean your final research project must follow a secondary research approach. This applies solely to the Literature Review & Research Questions assessment. You should draw on knowledge and information provided in your lessons. A range of sources relevant to your topic area is required for this assignment: you may wish to use less formal sources for background research on your topic, e.g. newspapers and reputable websites, in addition to a diverse range of academic sources available to you through your college, including journals, textbooks and chapters in edited volumes. Databases are a rich source for datasets. Any non-academic sources you use should be treated with caution. A minimum of 10 sources is required for this assignment.

Referencing style: All sources cited within the literature review should appear in the final reference list. The preferred referencing style is APA, although your teacher may offer an alternative style of Harvard referencing that can be used.

Expected word count: You are expected to write approximately 1,500-1,800 words to complete this assignment, following the structure outlined above. References and the project title are excluded from the word count.

Learning Outcomes Assessed:
· Conduct and produce a critically evaluative academic literature review appropriate to the proposed research question(s), following accepted conventions.
· Critically reflect on academic skills & performance, responding appropriately to feedback to improve aspects of academic work.

Submission Requirements: You must include the following paragraph on your title page: "I confirm that this assignment is my own work.
Where I have referred to academic sources, I have provided in-text citations and included the sources in the final reference list."

You must type your assessment in an academically suitable font (e.g. Arial), font size 11, with 1.5 spacing. Section and sub-section headings can be font size 14/16. You must submit the assessment electronically via the VLE module page. Please ensure you submit it via the Turnitin VLE plug-in.

When you submit a copy of your Proposal to Turnitin, you must include a title page with the following information:
✓ Module Tutor Name
✓ Student Name

When you submit your proposal on Turnitin, the submission title should include: your student ID number_module code and group_tutor initials, e.g. 2999999_PM600F_JB

NB: If you have technical problems submitting, you should do the following:
1. Contact College Services using this form before the deadline: https://kicpathways.formstack.com/forms/contact_gic
2. Under 'What is your enquiry about?', choose 'assignment hand in'.
3. In the 'How can we help you?' box, write what the assignment is (Assessment 1: Research Proposal), the module (PM600), group (e.g. JAN 25 - ENG Group A), tutor's name and the date it was due.
4. Attach your assignment and screenshot(s) of the error message.
Academic Integrity & Misconduct Information: Please use this link to access more information on academic integrity and misconduct: https://pathways.kaplaninternational.com/course/view.php?id=1940

Additional submission information - check you have done the following:

Formatting: consistent font, spacing, page numbers, formatting and subheadings
Citations: correct format and location throughout the report
Referencing: Harvard referencing system used correctly in the reference list
Summarising: summarising the results of research
Paraphrasing: paraphrasing the contents of research findings
Spell check: spell check the report
Proof-reading: proof-reading completed
Grammar: the report has been checked for grammatical accuracy

How will this assessment be marked? The following criteria will be used to evaluate your performance in this assessment:

Coverage and content (25%)
- How well you cover available sources and identify key content that is relevant to your research area.
- How suitable and persuasive your research objectives and research questions are.

Critical Appraisal (25%)
- How well you evaluate and comment on the literature in your research area. This includes identifying logical connections between sources and your own research project.

Organisation of ideas (25%)
- How well you organise your ideas to address your key research themes and assessment objectives.
- How well you structure your ideas using linking expressions, cohesive devices and narrative synthesis, as well as section headings and order.

Academic Expression (10%)
- How well you present your ideas in academic English and use relevant terminology for your research area.

Academic Integrity (15%)
- How well you follow academic conventions relating to appropriate register, paraphrasing and referencing.
- How genuine, accurate and precise the data/facts presented are.

You will receive a (%) grade for each criterion.
The overall assessment grade will be averaged from the five criteria, and the overall mark will be a percentage. You must achieve a minimum of 40% to pass this assessment.

How will you get feedback? Your tutor will grade the assessment and provide feedback for each criterion. Students will usually receive assessment feedback on Turnitin two weeks after the assessment submission deadline, unless affected by holidays, term breaks or other circumstances.
