kddata

Questions: 170 | Resolved: 170 | Answers: 263 | Views: 76864 | Avg time to answer: 1.4 days




38459
views
874
days
1
answers
Getting an error when running KNN algorithm on Project




559
views
936
days
4
answers
Error: undefined columns selected, when running dummy(x)




508
views
896
days
6
answers
Not Enough Distinct Predictions to compute area under ROC.




457
views
906
days
3
answers
R error: infinite or missing values in 'x' when running svd(subs_mat)




417
views
928
days
1
answers
Large Graded Partner Assignment Question




400
views
854
days
1
answers
Random Forest error on calibration function




374
views
903
days
3
answers
Question about the NYSE dataset for Large Assignment - heavily skewed dependent variable




373
views
875
days
10
answers
Error running LR with glmnet




367
views
853
days
1
answers
WHAT DATA




349
views
935
days
1
answers
Can you clarify the following statement: "Compute the sum by both ID1 and ID2"? (HW1 Exercise 4, due 2/1)




347
views
873
days
1
answers
Error from AUC function on KNN algorithm




337
views
882
days
7
answers
Euclidean Distance for K-Nearest Neighbors




320
views
935
days
4
answers
How to keep the same structure of a data frame when applying functions to it?




306
views
855
days
1
answers
Text mining after SVD




304
views
904
days
1
answers
How to use your newly computed v and d to calculate u on the second half of the data?




301
views
901
days
1
answers
Meaning of negative AUC (in Round 1 results)




295
views
932
days
1
answers
How to compute the time window in model deployment?




288
views
931
days
1
answers
How to compute a lag on a variable?




285
views
904
days
1
answers
Output of the KNN algorithm question -> Prediction or probability




283
views
905
days
3
answers
Warning in install.packages : package ‘impute’ is not available (for R version 3.2.3)




279
views
931
days
1
answers
Best Way to Attempt Homework




279
views
931
days
1
answers
Did you mean to say flag or lag in the exercise?




276
views
853
days
7
answers
predKNN Error: "Error in matrix(data = predKNN, ncol = k, nrow = nrow(testKNN)) : non-numeric matrix extent"




276
views
903
days
1
answers
Prediction Algorithm Output and what it means: If all my predNB values are under 0.50, is it ever guessing "YES?"




275
views
867
days
1
answers
RandomForest Prediction in Project




275
views
895
days
1
answers
Lead variable for predictions




275
views
882
days
4
answers
Create a bagged tree and tune the ensemble size




273
views
853
days
1
answers
Output for 5x2fcv function?




273
views
883
days
2
answers
2.8.4 Exercises for Subsection 2.5.5




270
views
932
days
1
answers
What is the difference between characters, numerics, factors, and integers?




269
views
868
days
6
answers
Error Received in Decision Tree




267
views
931
days
1
answers
Can you please clarify what you are asking for when you want predictors from 5 columns?




263
views
882
days
3
answers
Max AUC (bagged trees)




263
views
901
days
1
answers
How do we write the predict function to give predictions for specific stocks?




262
views
879
days
1
answers
How to manually compute KNN




262
views
881
days
1
answers
K-Nearest Neighbors "for loop"




260
views
902
days
3
answers
HW#2: Calculating Second Half of SUBS dataset




259
views
909
days
1
answers
How to manually compute proportions of 1s using k nearest neighbors in R?




258
views
882
days
1
answers
kNN Calculated Proportions




257
views
874
days
1
answers
GLM/GLMNET Multinomial/Binomial error




257
views
840
days
1
answers
Logistic Regression Log Lambda / Coefficients Graph Explanation




254
views
855
days
1
answers
Optimizing bins for calibration




253
views
894
days
3
answers
Predicting Using Naive Bayes




253
views
910
days
1
answers
What is the response variable for the Group Stock Price Project?




250
views
903
days
1
answers
Large Assignment: Input / Output and function deliverables




249
views
894
days
2
answers
Threading in R




248
views
897
days
1
answers
Coaching Marketplace: Can coaches meet with different teams?




248
views
878
days
1
answers
Leading variable




247
views
847
days
1
answers
Passing variables between functions




247
views
863
days
1
answers
Big Project Articles Text mining: Merging 2 data sets




246
views
868
days
1
answers
Bagged Trees Prediction




245
views
840
days
1
answers
Exam length




244
views
882
days
1
answers
taking Y of the nearest neighbors




242
views
906
days
1
answers
What is the purpose of svd?




242
views
847
days
2
answers
DATA_GRADING data from 2002 missing




240
views
894
days
2
answers
AUC




240
views
882
days
1
answers
What is meant by "Estimation phase: Create a big ensemble with many trees only once"




240
views
908
days
1
answers
Breaking a "tie" for K-nearest neighbors




240
views
895
days
5
answers
Reading .csv files as Dataframe




237
views
896
days
1
answers
Question about sapply




236
views
861
days
2
answers
How to execute R code inside a string variable?




235
views
875
days
1
answers
Difference between BasetableTRAIN, BasetableVAL, BasetableTEST, and BasetableTRAINbig




232
views
908
days
1
answers
Why do we sometimes use the transpose function back to back, and how do we recognize when it is appropriate to do so?




232
views
897
days
1
answers
kNN.index vs. knn




230
views
846
days
1
answers
Comprising a 5x2fcv




230
views
878
days
1
answers
Why use AUC?




228
views
845
days
1
answers
Summarizing Cross-Validation Performance Values




228
views
895
days
2
answers
Sorting by stock symbol




228
views
932
days
1
answers
What does putting "as." in front of numeric or factor accomplish?




228
views
901
days
1
answers
How to use validation method on two separate data sets (test data= jan2nd & train data=prev.yr data+jan1st data)




228
views
859
days
1
answers
Is it better to delete row of NA's or impute mode?




228
views
896
days
1
answers
What to do if a rm command was used when it shouldn't have been?




227
views
854
days
3
answers
Not enough memory to run RF on the Stocks data




227
views
881
days
1
answers
Using NaiveBayes() function that does not have the right criteria




225
views
878
days
1
answers
Dependent Variable




224
views
931
days
1
answers
How to combine two functions with differing outputs and equations?




223
views
931
days
1
answers
Homework due dates




223
views
855
days
1
answers
Choosing a good title




223
views
841
days
1
answers
Explain node impurity further




222
views
872
days
1
answers
Text mining: how to aggregate unstructured data in Large Assignment News Articles




222
views
896
days
3
answers
Coaching Marketplace




219
views
855
days
1
answers
Does the Wilcoxon-Ranks Test function need to only work for 10 AUC scores at the .05 level?




219
views
882
days
1
answers
Finding the Optimal # of Trees




219
views
882
days
1
answers
Second intermediate deliverable: predictions




218
views
865
days
1
answers
Large assignment question about merging




218
views
856
days
1
answers
Using Random Forest




218
views
873
days
1
answers
submitting LR




217
views
855
days
1
answers
How to select number of dates in test set




215
views
699
days
1
answers
Test Material Usage




214
views
897
days
1
answers
kNN in class dependent variable




214
views
722
days
2
answers
Viewing Prep Quiz answers




214
views
897
days
1
answers
First Prediction Wednesday Model




214
views
868
days
1
answers
Using the mapping.csv file with Article Data




212
views
845
days
1
answers
AUC ROC calculation explored




212
views
881
days
2
answers
HW#3 Question 1: Medium Tenure, High Spend




211
views
868
days
3
answers
Tuning in Decision Trees




210
views
849
days
3
answers
Reading in new data




210
views
867
days
1
answers
Round 4 Why are there so many unmatched Symbols?




209
views
880
days
1
answers
Discriminating between neighbors in K-nearest neighbors where distance measures are equal




208
views
851
days
1
answers
What does "return indicators" mean?




207
views
870
days
3
answers
Determining optimal number of trees - Exercise 2.8.6




207
views
851
days
1
answers
Clarification on Final Code Deliverable




207
views
851
days
1
answers
What do we do if our calibrated predictions are incorrect?




206
views
847
days
2
answers
New data passed with ensemble of models?




206
views
860
days
1
answers
yTEST




206
views
909
days
1
answers
What is the significance of document vector weighting?




205
views
878
days
1
answers
Independent and dependent variables for predictions




205
views
725
days
1
answers
Viewing of the Solutions to the In-Class Gradable




205
views
847
days
2
answers
How to combine multiple models for final submission




204
views
852
days
1
answers
Number of bins for plotting Calibration and bin classifier scores




204
views
881
days
1
answers
Computing Probabilities manually with Bayes Theorem.




203
views
847
days
1
answers
Matching symbols with predictions in the predict function




202
views
867
days
1
answers
Round 3 Data




202
views
886
days
1
answers
Predicting Probabilities of New (X,Y) Using Naive Bayes




201
views
878
days
1
answers
How to manually calculate AUC?




201
views
853
days
1
answers
How do we structure a data frame for stacking models?




201
views
848
days
1
answers
When to install.packages for grader




200
views
856
days
2
answers
The datasets.Rdata file location




200
views
843
days
2
answers
Requesting a summary of results from Stock Prediction final submission




199
views
847
days
1
answers
Should I subtract colMeans from my DTM matrix when I am recreating it in the predict function?




199
views
868
days
1
answers
Optimal Ensemble Size Exercise 2.8.6 for section 2.5.7




198
views
852
days
1
answers
Getting a better smoothed fit for plot




198
views
870
days
1
answers
Neural Network Calculation: Entropy Term, etc.




198
views
860
days
1
answers
Manually Evaluate Neural Network




197
views
701
days
1
answers
Defining the classes of columns while reading in data




196
views
877
days
1
answers
Loop with letters instead of numbers




195
views
856
days
1
answers
What's the difference between Predictions and Predictions 2




195
views
847
days
1
answers
Splitting a Table Without Hardcoding




193
views
868
days
1
answers
Defining Validation data vs training data




192
views
835
days
1
answers
Cross Validation Optimal Parameter Consistency




192
views
670
days
1
answers
Second Test Allowed Material Usage




191
views
544
days
3
answers
What do you mean when you use the word 'handling'?




189
views
866
days
1
answers
Inner merging DF with cardinality of many-to-many




186
views
860
days
1
answers
NAs in Dataset from Lead Variable




185
views
848
days
1
answers
columns.txt file in Data Grading folder




184
views
848
days
1
answers
MERGE() command Techniques




184
views
729
days
5
answers
Computer Error on the First Prep Quiz




181
views
835
days
2
answers
Cross Validation & Wilcoxon Solutions




179
views
851
days
1
answers
Wilcoxon assignment




179
views
665
days
1
answers
"idf" function




176
views
851
days
1
answers
graphing function takes labels as argument




171
views
708
days
1
answers
How would && operators increase efficiency?




171
views
850
days
1
answers
Wilcoxon signed-ranks test: Value of parameter "p" in the critical values table




171
views
851
days
1
answers
Deploying the model given by the calibrate function on unbinned or binned data?




152
views
638
days
1
answers
When do we perform variable selection outside of regression?




146
views
622
days
1
answers
How to select the sequence parameter to use in a loop when determining the optimal K for a KNN model




131
views
513
days
3
answers
Do we need to train the model on train+test set again after finding optimal ensemble size?




130
views
578
days
1
answers
Data Frame subsetted with logical types FALSE, TRUE




129
views
498
days
1
answers
Features in Support Vector Machines




128
views
472
days
1
answers
Support Vector Machines and figure 2.15




127
views
531
days
1
answers
Figure 2.8 and equation 2.24




125
views
531
days
2
answers
Good Predictions




123
views
533
days
1
answers
The difference between using predict() on BasetableVAL and BasetableTEST




122
views
538
days
3
answers
Naive Bayes: How can we tell when to tune data?




122
views
541
days
1
answers
Viewing files before reading into R using notepad




120
views
541
days
1
answers
combining rbind and lapply -- the reasoning behind not using them together




120
views
511
days
1
answers
Bagged Decision Tree vs. Random Forest




119
views
531
days
1
answers
Double colon?




118
views
502
days
2
answers
ICA #19 Solution




118
views
507
days
1
answers
Something forgotten before an important deadline




118
views
533
days
2
answers
In binomial likelihood function why is number of trials equal to 1




118
views
536
days
1
answers
Creating identical percentages of 1's and 0's efficiently




114
views
484
days
1
answers
Understanding classifier performance




107
views
492
days
1
answers
AUACC vs AUROC curves



