7COM1073 – Foundations of Data Science & Analysis Along With Python Code – Data Science Assignment Help



Assignment Task

 

The programming language you should use to complete this assessment is Python (version 3 and above only). In particular, you can use functions from the following packages: NumPy, pandas, Matplotlib, seaborn, and scikit-learn (sklearn).

 

Task 1:

Data pre-processing and data exploration

a. Use Pandas to load the data and report the number of data points (rows) in the dataset.

b. Consider “quality” as the class label. Report the number of features in the dataset and the number of data points in each class.

c. Randomly permute the data using the shuffle function from sklearn.utils. You must set a value for the random_state parameter. Assign the shuffled data to a new variable named white_wine.

d. Produce one scatter plot, that is, one feature against another feature. You are free to choose which two features you want to use.
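Steps (a) to (d) of Task 1 could be sketched roughly as follows. This is a minimal illustration, not a model answer: the small synthetic table and its column names (alcohol, pH, density) are assumptions standing in for the real dataset, which you would load with pd.read_csv from the file supplied with the assignment.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a script; drop in Jupyter
import matplotlib.pyplot as plt
from sklearn.utils import shuffle

# Assumption: synthetic stand-in data. For the assignment, replace this with
# pd.read_csv(...) pointing at the provided dataset file.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, 200),
    "pH": rng.normal(3.2, 0.15, 200),
    "density": rng.normal(0.994, 0.003, 200),
    "quality": rng.integers(3, 10, 200),  # integer labels from 3 to 9
})

# (a) number of data points (rows)
n_rows = len(data)

# (b) number of features (all columns except the label) and points per class
n_features = data.shape[1] - 1
class_counts = data["quality"].value_counts().sort_index()

# (c) random permutation; the random_state value here is an arbitrary choice
white_wine = shuffle(data, random_state=42)

# (d) one feature against another; the feature pair is a free choice
fig, ax = plt.subplots()
ax.scatter(white_wine["alcohol"], white_wine["density"], s=10, alpha=0.6)
ax.set_xlabel("alcohol")
ax.set_ylabel("density")
ax.set_title("density against alcohol")

print(n_rows, n_features)
print(class_counts)
```

Fixing random_state makes the permutation reproducible, which matters later because the validation and test sets are taken by row position.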

 

Task 2: PCA Analysis on the white-wine dataset Using Scikit-Learn.

a. Perform a PCA analysis on the whole white_wine dataset.

b. Plot the data in the PC1 and PC2 projections and label/colour the data in the plot according to their class label.
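A possible outline of Task 2 with scikit-learn is shown below. It rests on several assumptions: the data is again a synthetic stand-in with made-up column names, PCA is applied to the feature columns only (the brief does not say whether “quality” should be included), and the features are standardised first, which is a common choice rather than a stated requirement.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a script; drop in Jupyter
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for the shuffled white_wine DataFrame.
rng = np.random.default_rng(0)
white_wine = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, 200),
    "pH": rng.normal(3.2, 0.15, 200),
    "density": rng.normal(0.994, 0.003, 200),
    "quality": rng.integers(3, 10, 200),
})

X = white_wine.drop(columns="quality")  # features only (assumption)
y = white_wine["quality"]               # class labels, used for colouring

# Standardise so that no feature dominates purely because of its units
X_scaled = StandardScaler().fit_transform(X)

# (a) PCA, keeping the first two principal components
pca = PCA(n_components=2)
pcs = pca.fit_transform(X_scaled)

# (b) PC1 against PC2, coloured by class label
fig, ax = plt.subplots()
points = ax.scatter(pcs[:, 0], pcs[:, 1], c=y, cmap="viridis", s=10)
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
fig.colorbar(points, ax=ax, label="quality")

print(pca.explained_variance_ratio_)
```

The explained_variance_ratio_ attribute shows how much of the total variance PC1 and PC2 capture, which is useful context when interpreting the plot.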

 

Task 3:

Divide the white_wine dataset into a training set, a validation set, and a test set.

a. Take out the first 1000 rows from white_wine and save them as the validation set.

b. Take out the last 1000 rows from white_wine and save them as the test set.

c. Save the remaining rows from white_wine as the training set.
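The positional split in Task 3 can be expressed with iloc slicing, as in the sketch below. The 3000-row synthetic frame and the names validation_set, test_set, and training_set are illustrative assumptions; the real white_wine DataFrame from Task 1 would take its place.

```python
import numpy as np
import pandas as pd

# Assumption: synthetic stand-in for the shuffled white_wine DataFrame.
rng = np.random.default_rng(0)
white_wine = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, 3000),
    "pH": rng.normal(3.2, 0.15, 3000),
    "quality": rng.integers(3, 10, 3000),
})

# (a) first 1000 rows -> validation set
validation_set = white_wine.iloc[:1000]

# (b) last 1000 rows -> test set
test_set = white_wine.iloc[-1000:]

# (c) everything in between -> training set
training_set = white_wine.iloc[1000:-1000]

print(len(training_set), len(validation_set), len(test_set))
```

Because iloc slices by position rather than by index label, the split still behaves as intended after shuffling has reordered the index.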

 

Task 4:

Investigate how the size of the training dataset affects the model performance on the test set. In this task, treat the last column, ‘quality’, of the white_wine dataset as a real-valued target rather than a class label. You need to use the linear regression model to complete tasks (a)-(c) below. Note that you should use all available features in the dataset.

a. Produce a learning curve of the size of training set against the performance measurements. The performance should be measured on both the training set and the validation set. You need to choose at least 10 different sizes for the training set. For example, the first size may be 10% of the total training set produced in Task 3.

• Remember to scale the corresponding training set and the validation set.

b. Report the training set size you consider best for this work and explain why you chose it.

c. Report the performance on the test set obtained using the model trained with the best size.

• Remember to scale the corresponding training and test sets.
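One way Task 4 might be approached is sketched below. Several choices here are assumptions rather than requirements of the brief: the data is synthetic, mean squared error is used as the performance measure, the ten sizes run from 10% to 100% of the training set, and the “best” size is simply the one with the lowest validation error. The scaler is fitted on each training subset only and then applied to the validation or test data.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a script; drop in Jupyter
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic regression data standing in for white_wine.
rng = np.random.default_rng(0)
n, d = 3000, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(scale=0.5, size=n)

# Same positional split as in Task 3
X_val, y_val = X[:1000], y[:1000]
X_test, y_test = X[-1000:], y[-1000:]
X_train, y_train = X[1000:-1000], y[1000:-1000]

def fit_scaled(X_part, y_part):
    """Fit a scaler and a linear model on one training subset."""
    scaler = StandardScaler().fit(X_part)
    model = LinearRegression().fit(scaler.transform(X_part), y_part)
    return scaler, model

# (a) learning curve over ten training sizes
sizes, train_mse, val_mse = [], [], []
for frac in np.linspace(0.1, 1.0, 10):
    m = int(round(frac * len(X_train)))
    scaler, model = fit_scaled(X_train[:m], y_train[:m])
    sizes.append(m)
    train_mse.append(mean_squared_error(
        y_train[:m], model.predict(scaler.transform(X_train[:m]))))
    val_mse.append(mean_squared_error(
        y_val, model.predict(scaler.transform(X_val))))

fig, ax = plt.subplots()
ax.plot(sizes, train_mse, marker="o", label="training")
ax.plot(sizes, val_mse, marker="o", label="validation")
ax.set_xlabel("training set size")
ax.set_ylabel("mean squared error")
ax.legend()

# (b) assumption: pick the size with the lowest validation error
best_size = sizes[int(np.argmin(val_mse))]

# (c) refit at that size and evaluate once on the test set
scaler, model = fit_scaled(X_train[:best_size], y_train[:best_size])
test_mse = mean_squared_error(y_test, model.predict(scaler.transform(X_test)))
print(best_size, test_mse)
```

Fitting the scaler on the training subset alone keeps validation and test information out of the model, which is why the same fitted scaler is reused when transforming the held-out sets.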

 

Task 5:

Critical Discussion: Write your conclusions using critical thinking (no more than 150 words) in your Jupyter notebook submission.

a. Summarize your findings for each task.

b. For Task 4, discuss whether there is any problem with that experimental design. If there is, what is it? How may you further improve it so that the experimental results are more reliable?

 

This 7COM1073 – Data Science Assignment has been solved by our Data Science Experts at TVAssignmentHelp. Our assignment writing experts can provide a fresh solution to this question. We serve more than 10,000 students in Australia, the UK, and the US, helping them score HD in their academics. Our experts are trained to follow all marking rubrics and referencing styles.

Whether you order a used or a new solution, the quality of the work submitted by our assignment experts stays the same. You can choose either option and acquire an HD either way. A new assignment solution file gives you an exclusive, expert-quality assignment checked for plagiarism (with a free Turnitin file), while an old solution file is one that was previously considered worthy of the highest distinction.
