
# Catherine, Dima, Toni Discussion Page

last edited by 9 years, 6 months ago

This is our re-do of the dependent sample assignment from 2/10.

Monday 2/7/2011

| Function | Action |
|---|---|
| length(object) | number of elements/components |
| dim(object) | dimensions of an object |
| str(object) | structure of an object |
| class(object) | class or type of an object |
| mode(object) | how an object is stored |
| names(object) | names of components in an object |
| c(object, object, ...) | combines objects into a vector |
| cbind(object, object, ...) | combines objects as columns |
| rbind(object, object, ...) | combines objects as rows |
| object | prints the object |
| head(object) | lists the first part of the object |
| tail(object) | lists the last part of the object |
| ls() | lists current objects |
| rm(object) | deletes an object |
| newobject <- edit(object) | edits object and saves as newobject |
| fix(object) | edits in place |
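To see several of the functions listed above in action, here is a minimal sketch on made-up data. The names x, y, m_cols, m_rows and df are our own choices for illustration.

```r
# Two small vectors built with c()
x <- c(1, 2, 3)
y <- c(4, 5, 6)

length(x)               # number of elements

m_cols <- cbind(x, y)   # x and y become the columns of a matrix
m_rows <- rbind(x, y)   # x and y become the rows of a matrix
dim(m_cols)             # rows, then columns
dim(m_rows)

df <- data.frame(x = x, y = y)
class(df)               # the class of the object
mode(df)                # how the object is stored
names(df)               # the component (column) names
str(df)                 # compact overview of the structure
head(df, 2)             # first two rows
tail(df, 1)             # last row

ls()                    # objects currently in the workspace
```

Typing the name of an object alone (for example df) prints it, which is the "object" row of the table.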

To import data from an Excel file, use the RODBC package to access the files:

install.packages("RODBC")

Web scraping: the user extracts information embedded in a web page using the readLines(), grep() and gsub() functions. (R is case-sensitive, so the function is readLines, not readlines.)
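A minimal sketch of how the three functions fit together. To keep it runnable offline, the "page" is a few made-up lines of HTML read through textConnection(); for a real page one would pass a url() connection to readLines() instead. The content and the variable names are assumptions for illustration only.

```r
# A few lines of HTML standing in for a downloaded page (made-up content)
html <- c(
  "<html><body>",
  "<p>Price: <b>12.50</b></p>",
  "<p>Price: <b>7.25</b></p>",
  "</body></html>"
)

# readLines() reads a connection line by line
con <- textConnection(html)
page <- readLines(con)
close(con)

# grep() keeps only the lines that match a pattern
price_lines <- grep("Price", page, value = TRUE)

# gsub() deletes every character that is not a digit or a decimal point
prices <- as.numeric(gsub("[^0-9.]", "", price_lines))
prices
```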

Sunday 2/6/2011

Example for Linear Least Squares Regression:

http://www.cyclismo.org/tutorial/R/linearLeastSquares.html

When we type variablename <- lm(whatever values we pass), we should issue summary(variablename) to get the residuals, coefficients, residual standard error, etc. Issuing variablename alone will only show the coefficients! plot(variablename) will give multiple plots (we should discuss how to read them!!)
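A short sketch of the difference between printing the fit and summarizing it, using the built-in cars dataset instead of the tutorial's data. The names fit and s are our own.

```r
# cars is a built-in dataset: stopping distance (dist) versus speed
fit <- lm(dist ~ speed, data = cars)

fit                 # prints only the call and the coefficients
s <- summary(fit)   # adds residuals, coefficient table, residual SE, R-squared, F
s

coef(fit)           # intercept and slope
s$sigma             # residual standard error
s$r.squared         # multiple R-squared
s$fstatistic        # F statistic with its degrees of freedom

# plot(fit) draws the diagnostic plots one after another;
# a 2 x 2 layout shows the four default plots on one page
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)
```

Pulling the pieces out of the summary object by name is handy once we know which numbers we want to report.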

runif(n, min=0, max=1): It lets us generate random deviates from a uniform distribution between min and max; however, unlike mvrnorm, we can't specify the mean and the covariance matrix. We forgot to ask how to specify the min and max limits so that we don't get unwanted values!
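As far as we can tell, the limits are simply the min and max arguments. A small sketch, where the limits 10 and 20 and the seed are arbitrary choices:

```r
set.seed(1)                      # arbitrary seed, only to make the run repeatable

u <- runif(1000, min = 10, max = 20)

range(u)                         # every value lies between min and max
mean(u)                          # close to (10 + 20) / 2 for a large sample
sd(u)                            # close to (20 - 10) / sqrt(12) for a large sample
```

So the mean and spread are not set directly, but they follow from the limits we pick.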

useful commands:

help.search("a topic")

example("a topic")

data() shows all datasets in the loaded packages

sink(file = "filename", append = FALSE, type = c("output", "message"), split = FALSE): writes the output to the file filename. split = TRUE writes to both the file and the screen. Not quite sure what type is.

sink(): returns the output to the screen.
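A minimal sketch of the sink() round trip. It writes to a temporary file created with tempfile() so nothing in the working directory is touched; out_file is our own name.

```r
out_file <- tempfile(fileext = ".txt")   # temporary file for the demonstration

sink(out_file)                  # from here on, printed output goes to the file
print(summary(c(1, 2, 3, 4)))
cat("done\n")
sink()                          # output returns to the screen

readLines(out_file)             # inspect what was written
```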

source("myfile") will submit the code from myfile to the current R session.

png("myfile.png") redirects graphic output to the file myfile.png. We have to issue the drawing command (plot(), for example) and then dev.off(). We can also use jpeg(), bmp() or pdf().
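The open, draw, close pattern looks like this. The sketch uses pdf() because that device needs no extra graphics libraries; swapping in png("myfile.png") follows the same three steps. The file name comes from tempfile(), and plot_file is our own name.

```r
plot_file <- tempfile(fileext = ".pdf")   # temporary file for the demonstration

pdf(plot_file)        # 1. open the file device
plot(cars, main = "Speed and stopping distance")   # 2. draw
dev.off()             # 3. close the device so the file is written

file.exists(plot_file)
```

Forgetting dev.off() is the usual reason for an empty or unreadable file.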

help(package="packagename"): to list the functions and datasets included in packagename

Shoe Wear Graphics

library(MASS)

library(granova)

data(shoes)

granova.ds(data.frame(shoes), revc=TRUE, main="Shoe Wear", xlab="Sole material B", ylab="Sole material A")

"More about comparing Two independent samples." Assignment:

musicTypeA=c(26,21,22,26,19,22,26,25,24,21,23,23,18,29,22)

musicTypeB=c(18, 23, 21, 20, 20, 29, 20, 16, 20, 26, 21, 25, 17, 18, 19)

Lowry.A.B.dt=data.frame(A=musicTypeA,B=musicTypeB)

The updated function "es.twogps" that includes calculating the CI:

es.twogps <- function(x, y)
{
# generates standardized ES for input x, y (independent) sample data
mx <- mean(x)
my <- mean(y)
nx <- length(x)
# nb: nx will generally NOT equal ny
ny <- length(y)
md <- mx - my
# pooled variance and pooled standard deviation
vpoold <- ((nx-1)*var(x) + (ny-1)*var(y)) / (nx + ny - 2)
sdpoold <- sqrt(vpoold)
ES <- md/sdpoold
md <- round(md,2)
ES <- round(ES,2)
# GM is hard-coded to 22, the grand mean of the music data above
GM <- 22
# interval limits: GM plus or minus two pooled standard deviations
CIL <- GM-(2*sdpoold)
CIU <- GM+(2*sdpoold)
list(md=md,ES=ES,CIL=CIL,CIU=CIU)
}
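For comparison, here is a sketch of the textbook pooled-variance 95% confidence interval for the difference between two population means, computed by hand and checked against t.test(). This is one standard approach and not necessarily the method from class. It centers the interval on the mean difference and uses the standard error of that difference, where es.twogps uses the grand mean and the pooled standard deviation. The names se, tcrit, ci_manual and ci_ttest are our own.

```r
musicTypeA <- c(26, 21, 22, 26, 19, 22, 26, 25, 24, 21, 23, 23, 18, 29, 22)
musicTypeB <- c(18, 23, 21, 20, 20, 29, 20, 16, 20, 26, 21, 25, 17, 18, 19)

nx <- length(musicTypeA)
ny <- length(musicTypeB)
md <- mean(musicTypeA) - mean(musicTypeB)

# pooled variance, as in es.twogps
vpooled <- ((nx - 1) * var(musicTypeA) + (ny - 1) * var(musicTypeB)) / (nx + ny - 2)

# standard error of the mean difference and the t critical value
se <- sqrt(vpooled * (1 / nx + 1 / ny))
tcrit <- qt(0.975, df = nx + ny - 2)

ci_manual <- c(md - tcrit * se, md + tcrit * se)
ci_manual

# the same interval from the built-in test, assuming equal variances
ci_ttest <- as.numeric(t.test(musicTypeA, musicTypeB, var.equal = TRUE)$conf.int)
ci_ttest
```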

Questions:

1. In simple linear regression, what are:
• influence statistics
• Residual standard error
• F-statistic
• Multiple R-squared

#### bob pruzek said

at 6:56 pm on Feb 6, 2011

I will provide an alternative way, maybe two, for calculating a (95%) confidence interval for the population mean difference. The direct method you've used is not recommended (tho' it might, as we shall see, be close numerically to what I shall provide). More on this Monday. bp

#### Dima Kassab said

at 7:33 pm on Feb 6, 2011

Thanks Bob!

#### bob pruzek said

at 1:05 am on Feb 7, 2011

Dima (or ??), You asked: "Questions:
What is: (in simple linear regression)
influence statistics
Residuals standard error
F-statistic
Multiple R-squared"
We have not covered regression yet, so these questions will have to wait a bit for detailed answers. But I will bring a book or two that should help in the meantime. And I can provide many URLs that will also help if you want a head start. As for some of the other things on this page, I'm not sure if there are q's you want me to answer, but if any of you (in the whole class) do, just make the q's explicit. BP

#### bob pruzek said

at 1:18 am on Feb 7, 2011

BTW, to simulate data it is probably best to use rnorm or rt functions, and not runif(n, min=0, max=1): real data rarely conform to what runif generates! bp

#### Dima Kassab said

at 1:36 am on Feb 7, 2011

Indeed, these questions are topics that any one of our small group (Catherine, Dima and Tony) encounters and doesn't have a clear idea about! We post them here to keep track of areas we need to learn more about. Eventually, if no one in our group has an answer and the questions aren't covered in class, we'll make the q's explicit.