R Language Assignment Answer
TASK:
Question 1:
The data file film.txt contains data for five variables. The thickness of a plastic film is measured in 4 positions after being cut. The positions of the measurements are: top right, top left, bottom right and bottom left.
Provide R code, output and written interpretation for parts a) to d) of this question. Provide only output that is directly relevant to address each section. Test for multivariate normality (MVN) by:
a) Describe the structure of the film.txt data.
b) Produce and interpret univariate QQ plots and histograms and univariate Shapiro-Wilk tests of normality for each of the four film thickness variables. Which is the most non-normally distributed variable?
c) Produce and interpret perspective and contour plots for the top-right and top-left film thickness variables. What is an inherent problem with using these plots to assess MVN?
d) Do the analysis necessary to provide the results of the Mardia, Henze-Zirkler and Royston tests of MVN based on all four film thickness variables. Include in your interpretation:
The Chi-Square QQ plot: describe how it is constructed and its relationship to the univariate normal QQ plots.
What is a key limitation of these MVN statistical tests?
e) One way to try and meet the MVN assumption could be to remove some of the variables from the multivariate analysis (do not perform this analysis). Suggest three additional ways that you might improve univariate and multivariate normality for data sets in general.
f) In part e) we suggested removing some variables to try and help the data approach MVN. Suggest one other reason why reducing the number of variables used in the multivariate analysis may be important. (This question does not relate to this particular data set.)
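The construction of the Chi-Square QQ plot asked about in part d) can be sketched in base R. This is only an illustration under stated assumptions: film.txt is not reproduced here, so the data frame `film` and its column names are simulated stand-ins, with one variable made deliberately skewed. The idea is that, under MVN, the squared Mahalanobis distances of the observations are approximately chi-square distributed with degrees of freedom equal to the number of variables, so plotting the ordered distances against chi-square quantiles plays the same role as a normal QQ plot does for a single variable.

```r
# Simulated stand-in data: film.txt is not available here, so the column
# names and values below are assumptions for illustration only.
set.seed(1)
n <- 100
film <- data.frame(
  top_right    = rnorm(n, mean = 10, sd = 1),
  top_left     = rnorm(n, mean = 10, sd = 1),
  bottom_right = 9 + rexp(n, rate = 1),  # deliberately right-skewed
  bottom_left  = rnorm(n, mean = 10, sd = 1)
)

# Univariate checks: a normal QQ plot per variable and Shapiro-Wilk p-values.
op <- par(mfrow = c(2, 2))
for (v in names(film)) {
  qqnorm(film[[v]], main = v)
  qqline(film[[v]])
}
par(op)
sw_p <- sapply(film, function(x) shapiro.test(x)$p.value)
print(round(sw_p, 4))

# Chi-square QQ plot: ordered squared Mahalanobis distances against
# chi-square quantiles with df equal to the number of variables.
X     <- as.matrix(film)
p     <- ncol(X)
d2    <- mahalanobis(X, center = colMeans(X), cov = cov(X))
chi_q <- qchisq(ppoints(n), df = p)
plot(chi_q, sort(d2),
     xlab = "Chi-square quantile", ylab = "Squared Mahalanobis distance",
     main = "Chi-square QQ plot")
abline(0, 1, lty = 2)  # points near this line are consistent with MVN
```

The Mardia, Henze-Zirkler and Royston tests themselves are not in base R; they would come from an add-on package, which this sketch deliberately does not assume.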
Question 2:
The data file iris.txt contains data for four flower characteristics variables for three species of iris.
Provide R code, output and written interpretation for parts a) to f) of this question.
a) Produce a draftsman display for the 4 flower characteristics variables. Interpret these plots, relating back to the original data where it may add to the interpretation. What are the y and x-axes on plot [3,2] of the draftsman plot?
b) In the context of MANOVA, list the dependent and independent variables and define the relationship that the MANOVA would test.
c) Produce the correlation matrix for the flower characteristics variables. Provide an interpretation of the correlations and indicate what they suggest about the potential for the variables to be MVN distributed? (do not test for MVN) (4 marks)
d) Using MANOVA in R, test for differences in flower characteristics between the three species. Include tests using all four test statistics covered in this course and interpret output (assume the assumption of MVN is met).
e) Why is a small Wilks' lambda statistic likely to indicate significant differences between at least some groups? Which of the four tests used in part d) would be the best to interpret if there are concerns about multivariate normality or covariance equality?
f) Produce output that specifically compares each of the groups with each other using Hotelling's T² test (the multivariate equivalent of the t-test) and a significance level of 0.05. Determine the significance level corrected for multiple tests. Do not provide R output; instead, reproduce and complete the following table for all comparisons and interpret. How may sample sizes have affected these results and those in part c)?
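A minimal base R sketch of the MANOVA workflow in parts d) and f) follows. It assumes R's built-in `iris` data frame as a stand-in for iris.txt (the file's actual column names may differ). The pairwise comparisons are run as two-group MANOVAs, which for two groups is equivalent to Hotelling's T² test, and the corrected significance level uses a Bonferroni adjustment; other corrections are possible.

```r
# Built-in iris data used as a stand-in for iris.txt (an assumption).
Y   <- as.matrix(iris[, 1:4])          # four flower characteristics
fit <- manova(Y ~ Species, data = iris)

# The four multivariate test statistics available in summary.manova().
tests <- c("Wilks", "Pillai", "Hotelling-Lawley", "Roy")
p_values <- sapply(tests, function(tt) {
  summary(fit, test = tt)$stats["Species", "Pr(>F)"]
})
print(p_values)

# Bonferroni-corrected significance level for all pairwise comparisons.
k             <- nlevels(iris$Species)
n_comparisons <- choose(k, 2)
alpha_adj     <- 0.05 / n_comparisons

# Pairwise comparisons: a two-group MANOVA is equivalent to Hotelling's T^2.
pairs <- combn(levels(iris$Species), 2, simplify = FALSE)
pair_p <- sapply(pairs, function(g) {
  d  <- droplevels(iris[iris$Species %in% g, ])
  Yd <- as.matrix(d[, 1:4])
  summary(manova(Yd ~ Species, data = d),
          test = "Hotelling-Lawley")$stats["Species", "Pr(>F)"]
})
names(pair_p) <- sapply(pairs, paste, collapse = " vs ")
print(pair_p < alpha_adj)  # TRUE where the pair differs at the corrected level
```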
Question 3:
The data file usair.dat contains data for seven air quality variables measured across 41 United States cities. Provide R code, output and written interpretation for all analyses.
a) Produce the correlation and covariance matrices. Explain the difference between these matrices in detail (i.e. explain clearly how the values are adjusted mathematically and the effect of these changes). Would using the covariance matrix in PCA on the USair data be appropriate? Why?
b) Perform PCA analysis on the 7 variables using the prcomp function. Discuss the eigenvalues, % variation and scree plot and how they influence your decision on how many PCs to interpret from this analysis. Remember to keep in mind the overall purpose of PCA.
c) Interpret the first PC. Include the Z equation and a plot of the loadings on the first PC in your answer.
d) What is the correlation between the first and second PCs and what does this tell you?
e) Produce and interpret a biplot based on the first 2 PCs. In particular, explain your interpretation of the air quality variables in city 1 compared to city 11 and city 9. Relate your interpretation back to the original data.
f) Was this a useful analysis for this data set? Explain.
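The PCA steps in this question can be sketched in base R as follows. Since usair.dat is not reproduced here, the sketch assumes the built-in `USArrests` data as a stand-in; it has four variables on very different scales, which is the situation where the choice between covariance and correlation matrices matters. The correlation matrix is the covariance matrix with each entry divided by the product of the two variables' standard deviations, and `prcomp` with `scale. = TRUE` works on that standardised scale.

```r
# Stand-in data (an assumption): built-in USArrests instead of usair.dat.
X <- USArrests

S <- cov(X)   # covariance matrix, depends on each variable's units
R <- cor(X)   # correlation matrix, unit-free
# R is S rescaled: R[i, j] = S[i, j] / (sd_i * sd_j)
print(all.equal(cov2cor(S), R))

# PCA on standardised variables (equivalent to using the correlation matrix).
pca <- prcomp(X, center = TRUE, scale. = TRUE)

eig <- pca$sdev^2                # eigenvalues of the correlation matrix
pct <- 100 * eig / sum(eig)      # percentage of variation per PC
print(round(cbind(eigenvalue = eig, percent = pct, cumulative = cumsum(pct)), 2))
screeplot(pca, type = "lines", main = "Scree plot")

# Loadings on PC1: the coefficients of the Z1 equation
# Z1 = a1 * X1 + a2 * X2 + ... on the standardised variables.
pc1 <- pca$rotation[, 1]
barplot(pc1, main = "Loadings on PC1")

# PC scores are uncorrelated by construction.
pc_cor <- cor(pca$x[, 1], pca$x[, 2])

biplot(pca, choices = 1:2)
```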
Question 4:
For this question you will continue to use the data file usair.dat from Question 3.
Provide R code, output and written interpretation for all analyses.
a) Perform parallel analysis and evaluate how many PCs should be used in FA. Compare to your choice of number of PCs in Q3b).
b) Explain in your own words how the parallel analysis works.
c) Perform a Factor Analysis on all 7 variables (apply no rotation) using the number of factors you identified in part a). Interpret the output including the:
Variance explained
Chi-square test
Variable loadings
Difference in uniqueness values for the variables wind.speed and annual.precip
d) Repeat the FA with a varimax rotation and calculate the communalities. Interpret:
The aim and features of a varimax rotation
Changes in the variable loadings
The communalities.
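Parallel analysis and the effect of a varimax rotation can be illustrated with a short base R sketch. It rests on several assumptions: the data are simulated with seven hypothetical variables `v1` to `v7` driven by two latent factors (usair.dat is not available here), the parallel analysis is a hand-rolled version that compares observed eigenvalues with the 95th percentile of eigenvalues from random normal data of the same size, and the factor analysis uses `factanal` from the stats package. A dedicated package may implement parallel analysis differently.

```r
# Simulated stand-in for the 7 variables (an assumption, not usair.dat).
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
noise <- function() rnorm(n, sd = 0.6)
X <- data.frame(
  v1 = f1 + noise(), v2 = f1 + noise(), v3 = f1 + noise(), v4 = f1 + noise(),
  v5 = f2 + noise(), v6 = f2 + noise(), v7 = f2 + noise()
)
p <- ncol(X)

# Parallel analysis: keep components whose eigenvalue exceeds what
# uncorrelated random data of the same dimensions would produce.
obs_eig <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
sim_eig <- replicate(200, {
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(Z), symmetric = TRUE, only.values = TRUE)$values
})
threshold <- apply(sim_eig, 1, quantile, probs = 0.95)
n_keep <- sum(cumprod(obs_eig > threshold))  # leading eigenvalues above threshold

# Factor analysis without and with varimax rotation.
fa_none <- factanal(X, factors = n_keep, rotation = "none")
fa_vmax <- factanal(X, factors = n_keep, rotation = "varimax")

# Communality = sum of squared loadings per variable; an orthogonal
# rotation redistributes loadings across factors but leaves this unchanged.
comm_none <- rowSums(unclass(fa_none$loadings)^2)
comm_vmax <- rowSums(unclass(fa_vmax$loadings)^2)
print(round(cbind(unrotated = comm_none, varimax = comm_vmax,
                  uniqueness = fa_vmax$uniquenesses), 3))
```

Comparing `fa_none$loadings` with `fa_vmax$loadings` shows how the rotation pushes each variable towards loading strongly on one factor, while the communalities and uniquenesses stay the same.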