Catherine, Dima, Toni Discussion Page

Page history last edited by Tony Leonardi 13 years, 1 month ago

This is our re-do of the dependent sample assignment from 2/10: Driving Reaction Time Under Distractive Conditions.docx

 

Monday 2/7/2011

 

Function                        Action

length(object)                  number of elements/components
dim(object)                     dimensions of an object
str(object)                     structure of an object
class(object)                   class or type of an object
mode(object)                    how an object is stored
names(object)                   names of components in an object
c(object, object, ...)          combines objects into a vector
cbind(object, object, ...)      combines objects as columns
rbind(object, object, ...)      combines objects as rows
object                          prints the object
head(object)                    lists the first part of the object
tail(object)                    lists the last part of the object
ls()                            lists current objects
rm(object)                      deletes an object
newobject <- edit(object)       edits object and saves as newobject
fix(object)                     edits in place
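A minimal sketch of several of the functions in the table, applied to a small made-up data frame. The names scores, extra, longer and wider and all of the values are hypothetical and only serve as illustration.

```r
# Hypothetical example data; not from the class notes
scores <- data.frame(id = 1:6, time = c(0.52, 0.61, 0.48, 0.75, 0.66, 0.58))

length(scores)   # number of components (columns) of the data frame
dim(scores)      # number of rows and columns
str(scores)      # compact summary of the structure
class(scores)    # "data.frame"
mode(scores)     # "list": a data frame is stored as a list of columns
names(scores)    # column names

extra  <- c(0.70, 0.49)                                      # c() builds a vector
longer <- rbind(scores, data.frame(id = 7:8, time = extra))  # append rows
wider  <- cbind(scores, fast = scores$time < 0.6)            # append a column

head(longer, 3)  # first three rows
tail(longer, 3)  # last three rows
```

Interactive helpers such as edit() and fix() are left out because they open an editor window.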

 

 

To import data from an Excel file, use the RODBC package to access the file:

install.packages("RODBC")

 

Web scraping: the user extracts information embedded in a web page using the following functions (R is case-sensitive, so note the capital L in readLines):

readLines()         grep()         gsub()
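A small sketch of how these three functions can work together. To keep it self-contained, the "page" here is a hypothetical character vector read through textConnection(); with a real page, readLines() would be given the URL instead.

```r
# Hypothetical page source standing in for a real web page
html <- c("<html><body>",
          "<p>Mean reaction time: <b>0.62</b> seconds</p>",
          "<p>Sample size: <b>25</b></p>",
          "</body></html>")

con  <- textConnection(html)
page <- readLines(con)        # one element per line of the page
close(con)

hits   <- grep("<b>", page, value = TRUE)         # keep lines that contain a <b> tag
values <- gsub(".*<b>(.*)</b>.*", "\\1", hits)    # keep only the text inside the tag
as.numeric(values)
```

grep() finds the lines of interest and gsub() strips away the surrounding markup; the pattern used here assumes one bold value per line.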

 

Sunday 2/6/2011

Example for Linear Least Squares Regression:

http://www.cyclismo.org/tutorial/R/linearLeastSquares.html

When we type variablename <- lm(whatever values we pass), we should issue summary(variablename) to get the Residuals, Coefficients, Residual standard error, etc. Issuing variablename alone will only show the coefficients! plot(variablename) will give multiple plots (we should discuss how to read them!!)
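As an illustration of the difference between printing the fit and summarizing it, here is a sketch using cars, a data set that ships with R. The choice of data set and the name fit are ours, not from the tutorial.

```r
# cars is a built-in data set (speed and stopping distance)
fit <- lm(dist ~ speed, data = cars)

fit                   # prints only the call and the coefficients
s <- summary(fit)     # residuals, coefficient table, residual SE, R-squared, F
s$sigma               # residual standard error
s$r.squared           # multiple R-squared
s$adj.r.squared       # adjusted R-squared
s$fstatistic          # F statistic with its two degrees of freedom

# plot(fit) draws several diagnostic plots; a 2 x 2 layout shows four at once:
# par(mfrow = c(2, 2)); plot(fit)
```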

 

runif(n, min=0, max=1): It lets us generate random deviates from a uniform distribution; however, unlike mvrnorm, we can't specify the mean and the covariance matrix. We forgot to ask how we can specify the min and max limits so we don't get unwanted values!
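A short sketch of the contrast. In runif() the min and max arguments are themselves the limits, so every generated value falls between them. The seed, sample sizes, mu and Sigma below are arbitrary values chosen for illustration.

```r
set.seed(1)                             # arbitrary seed, for reproducibility
u <- runif(1000, min = 10, max = 20)    # uniform deviates between 10 and 20
range(u)                                # both ends stay inside [10, 20]

# mvrnorm() from MASS takes a mean vector and a covariance matrix instead
library(MASS)
mu    <- c(0, 5)                                 # hypothetical means
Sigma <- matrix(c(1, 0.5, 0.5, 2), nrow = 2)     # hypothetical covariance matrix
z <- mvrnorm(n = 1000, mu = mu, Sigma = Sigma)
colMeans(z)                             # close to mu, up to sampling error
```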

 

useful commands:

help.search("a topic")

example("a topic")

data() shows all datasets in the loaded packages

sink(file = "filename", append = FALSE, type = c("output", "message"), split = FALSE) writes the output to the file filename :) split = TRUE writes to both the file and the screen. Not quite sure what type is.

sink(): returns the output to the screen.

source("myfile") will submit the code from myfile to the current R session.

png("myfile.png") redirects graphic output to myfile.png. We have to issue the drawing command (plot() for example) and then dev.off(). We can also use jpeg(), bmp() or pdf().

help(package="packagename"): to list the functions and datasets included in packagename
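A sketch of the sink() and graphics-device commands above, writing to temporary files so nothing in the working directory is touched. The file names and the plotted values are hypothetical. On the type argument: according to the sink help page, "output" diverts normal printed output and "message" diverts messages, warnings and errors.

```r
out_file <- tempfile(fileext = ".txt")    # hypothetical output file
sink(out_file)                            # divert printed output to the file
print(summary(c(1, 2, 3, 4)))
sink()                                    # send output back to the screen
readLines(out_file)                       # the summary is now in the file

plot_file <- tempfile(fileext = ".pdf")   # hypothetical plot file
pdf(plot_file)                            # open a PDF graphics device
plot(1:10, (1:10)^2, main = "Example plot")
dev.off()                                 # close the device so the file is written
file.exists(plot_file)
```

pdf() is used here only because it is available on every R installation; png(), jpeg() and bmp() follow the same open, draw, dev.off() pattern.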

 

 

 

 

 

Shoe Wear Graphics

library(MASS)

library(granova)  # provides granova.ds

data(shoes)

granova.ds(data.frame(shoes), revc=TRUE, main="Shoe Wear", xlab="Sole material B", ylab="Sole material A")

 

 "More about comparing Two independent samples." Assignment:

To create the dataframe: Check http://msenux.redwoods.edu/math/R/dataframe.php for more information.

musicTypeA=c(26,21,22,26,19,22,26,25,24,21,23,23,18,29,22)

musicTypeB=c(18, 23, 21, 20, 20, 29, 20, 16, 20, 26, 21, 25, 17, 18, 19)

Lowry.A.B.dt=data.frame(A=musicTypeA,B=musicTypeB)

 

The updated function "es.twogps" that includes calculating the CI: 

 

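es.twogps <- function(x, y)
{
  # generates standardized ES for input x, y (indep.) sample data
  mx <- mean(x)
  my <- mean(y)
  nx <- length(x)
  # nb: nx will generally NOT equal ny
  ny <- length(y)
  md <- mx - my
  # pooled variance and pooled standard deviation
  vpoold <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
  sdpoold <- sqrt(vpoold)
  ES <- md / sdpoold
  md <- round(md, 2)
  ES <- round(ES, 2)
  # GM is hard-coded to 22, the grand mean of musicTypeA and musicTypeB above
  GM <- 22
  # interval limits: GM plus or minus two pooled standard deviations
  CIL <- GM - (2 * sdpoold)
  CIU <- GM + (2 * sdpoold)
  list(md = md, ES = ES, CIL = CIL, CIU = CIU)
}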
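For comparison, here is a sketch of the usual pooled-variance t interval for the difference between two population means, checked against t.test(). This is only one standard textbook approach, assuming equal population variances, and is not necessarily the alternative suggested in the comments. The names se, tcrit, ci_manual and ci_ttest are ours.

```r
musicTypeA <- c(26, 21, 22, 26, 19, 22, 26, 25, 24, 21, 23, 23, 18, 29, 22)
musicTypeB <- c(18, 23, 21, 20, 20, 29, 20, 16, 20, 26, 21, 25, 17, 18, 19)

nx <- length(musicTypeA)
ny <- length(musicTypeB)
md <- mean(musicTypeA) - mean(musicTypeB)          # observed mean difference
sdpoold <- sqrt(((nx - 1) * var(musicTypeA) + (ny - 1) * var(musicTypeB)) /
                (nx + ny - 2))                     # pooled standard deviation

se    <- sdpoold * sqrt(1 / nx + 1 / ny)           # standard error of the difference
tcrit <- qt(0.975, df = nx + ny - 2)               # critical t value for 95%
ci_manual <- c(md - tcrit * se, md + tcrit * se)

# The same interval from t.test(), assuming equal variances
ci_ttest <- t.test(musicTypeA, musicTypeB, var.equal = TRUE)$conf.int

ci_manual
ci_ttest
```

Unlike CIL and CIU in es.twogps, this interval is centered on the mean difference md and uses the standard error of the difference, not the pooled standard deviation itself.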

 

Questions:

  1. What is: (in simple linear regression)
    • influence statistics
    • Residuals standard error
    • F-statistic
    • Multiple R-squared
    • Adjusted R-squared

Comments (5)

bob pruzek said

at 6:56 pm on Feb 6, 2011

I will provide an alternative way, maybe two, for calculating a (95%) confidence interval for the population mean difference. The direct method you've used is not recommended (tho' it might, as we shall see, be close numerically to what I shall provide). More on this Monday. bp

Dima Kassab said

at 7:33 pm on Feb 6, 2011

Thanks Bob!

bob pruzek said

at 1:05 am on Feb 7, 2011

Dima (or ??), You asked: "Questions:
What is: (in simple linear regression)
influence statistics
Residuals standard error
F-statistic
Multiple R-squared
Adjusted R-squared"
We have not covered regression yet, so these questions will have to wait a bit for detailed answers. But I will bring a book or two that should help in the meantime. And I can provide many URLs that will also help if you want a head start. As for some of the other things on this page, I'm not sure if there are q's you want me to answer, but if any of you (in the whole class) do, just make the q's explicit. BP

bob pruzek said

at 1:18 am on Feb 7, 2011

BTW, to simulate data it is probably best to use rnorm or rt functions, and not runif(n, min=0, max=1):
Real data rarely conform to what runif generates! bp

Dima Kassab said

at 1:36 am on Feb 7, 2011

Indeed, these questions are topics that any one of our small group (Catherine, Dima and Tony) has encountered and doesn't have a clear idea about! We post them here to keep track of areas we need to learn more about. Eventually, if no one in our group has an answer and the questions aren't covered in class, we'll make the q's explicit.
Thanks for your concern Bob.
