Need Help with this Question or something similar to this? We got you! Just fill out the order form (follow the link below), and your paper will be assigned to an expert to help you ASAP.
Question Description
Question 11 pts
SQL is an acronym for _____________.
Group of answer choicesSpeed Query Language
Special Query Language
Structured Query Language
Statistics Query Language
Flag this QuestionQuestion 21 pts
For a prediction task with 7 features, what would be a desirable sample size?
Group of answer choices70
50
35
7
Flag this QuestionQuestion 31 pts
The response (outcome) variable is whether the customer defaulted on their debt. The
predictor is the average balance and gender (gender = 1 if male and 0 if female). The following
R output is the fitted logistic regression model.
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.87 0.5170250 -21.03 <2e-16 ***
balance 0.001744 0.0002494 6.99 <2e-16 *** genderM -0.631 0.1153 -5.47 <2e-16 ***
—
From the output, will a male customer with the average debt of $4000 default?Group of answer choicesCannot determine since each customer is different.
The customer is very unlikely to default.
The customer is very likely to default.
The customer will default.
Flag this QuestionQuestion 41 pts
Classification and Regression Trees (CART) is __________.
Group of answer choicesa supervised learning technique to classify binary outcomes
a supervised learning technique to classify the data value into a given category
a supervised learning technique for classification that is based on tree-based
None of these
Flag this QuestionQuestion 51 pts
k-nearest neighbor (kNN) is ______________.
Group of answer choicesa supervised learning technique to classify binary outcomes
a supervised learning technique to classify the data value into a given category
a supervised learning technique for classification that is based on tree.
None of these
Flag this QuestionQuestion 61 pts
Consider the dataset
1.3, 1.5, 2.6, 2.8, 3.3, 4.0, 8.9.
Using a rule of +/-3, it 8.9 an outlier? The mean and standard deviation are 3.5 and 2.57, respectively.
Group of answer choicesYes
No
Maybe
Cannot be determined
Flag this QuestionQuestion 71 pts
Which one of the following is NOT a data mining algorithm after completion of sampling, handling missing values, and dealing with outliers?
Group of answer choicesDimension reduction
Imputation
Feature representation
Feature scaling
Flag this QuestionQuestion 81 pts
Consider the following dataset. Compute the distance between the 1st and 3rd sample.
sampleaciditydensityalcoholpH100.99689.83.220.040.9979.83.2630.560.9989.83.16Group of answer choices0.110
0.315
0.561
0.735
Flag this QuestionQuestion 91 pts
The response (outcome) variable is whether the customer defaulted on their debt. The
predictor is the average balance. The following R output is the fitted logistic regression model.
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.87 0.5170250 -10.08 <2e-16 ***
balance 0.001744 0.0002494 11.12 <2e-16 ***
—
From the output, will a customer with the average debt of $8000 default?Group of answer choicesCannot determine since each customer is different.
The customer is very unlikely to default.
The customer is very likely to default.
The customer will default.
Flag this QuestionQuestion 101 pts
Data mining is ____________.
Group of answer choicesa domain-specific language in programming and it is designed to handle structured data
the process of collecting, organizing, analyzing, interpreting and presenting the data
the process of discovering patterns in large datasets to predict outcomes
all of the above
